NOVEL POLYNUCLEOTIDES

The present application claims benefit of Japanese Patent Application Nos. 
Hei. 11-377484 (filed December 16, 1999), 2000-159162 (filed April 7, 2000) 
and 2000-280988 (filed August 3, 2000), the entire contents of each of which 
is incorporated herein by reference. 

The contents of the attached CD-R compact discs are incorporated herein by
reference in their entirety. The attached discs each contain an identical
copy of a file "SEQ2.TXT", which was created on the discs on December 13, 2000,
and is 25,891 KB in size.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to novel 
polynucleotides derived from microorganisms belonging to 
coryneform bacteria and fragments thereof, polypeptides 
encoded by the polynucleotides and fragments thereof, 
polynucleotide arrays comprising the polynucleotides and 
fragments thereof, computer readable recording media in 
which the nucleotide sequences of the polynucleotide and 
fragments thereof have been recorded, and use of them as 
well as a method of using the polynucleotide and/or 
polypeptide sequence information to make comparisons.

2. Brief Description of the Background Art

Coryneform bacteria are used in producing various
useful substances, such as amino acids, nucleic acids,
vitamins, saccharides (for example, ribulose), organic
acids (for example, pyruvic acid), and analogues of the
above-described substances (for example, N-acetylamino
acids), and are industrially very useful microorganisms.
Many mutants thereof are known.

For example, Corynebacterium glutamicum is a Gram-positive
bacterium identified as a glutamic acid-producing
bacterium, and many amino acids are produced by mutants
thereof. For example, 1,000,000 ton/year of L-glutamic
acid, which is useful as a seasoning for umami (delicious
taste), 250,000 ton/year of L-lysine, which is a valuable
additive for livestock feeds and the like, and several
hundred ton/year or more of other amino acids, such as
L-arginine, L-proline, L-glutamine, L-tryptophan, and the
like, have been produced in the world (Nikkei Bio Yearbook
99, published by Nikkei BP (1998)).
The production of amino acids by Corynebacterium
glutamicum is mainly carried out by its mutants (metabolic
mutants), which have mutated metabolic pathways and
regulatory systems. In general, an organism is provided
with various metabolic regulatory systems so as not to
produce more amino acids than it needs. In the
biosynthesis of L-lysine, for example, a microorganism
belonging to the genus Corynebacterium is regulated so
that excessive production is prevented through
concerted inhibition by lysine and threonine of the
activity of a biosynthesis enzyme common to lysine,
threonine and methionine, i.e., an aspartokinase (J.
Biochem., 65: 849-859 (1969)). The biosynthesis of
arginine is controlled by repressing the expression of its
biosynthesis gene by arginine so as not to biosynthesize an
excessive amount of arginine (Microbiology, 142: 99-108
(1996)). It is considered that these metabolic regulatory
mechanisms are deregulated in amino acid-producing mutants.
Similarly, the metabolic regulation is deregulated in
mutants producing nucleic acids, vitamins, saccharides,
organic acids and analogues of the above-described
substances so as to improve the productivity of the
objective product.

However, accumulation of basic genetic, biochemical
and molecular biological data on coryneform bacteria is
insufficient in comparison with Escherichia coli, Bacillus
subtilis, and the like. Also, few findings have been
obtained on mutated genes in amino acid-producing mutants.
Thus, there are various mechanisms, which are still unknown,
of regulating the growth and metabolism of these
microorganisms.

A chromosomal physical map of Corynebacterium
glutamicum ATCC 13032 has been reported, and it is known that its
genome size is about 3,100 kb (Mol. Gen. Genet., 252: 255-265
(1996)). Calculating on the basis of the usual gene
density of bacteria, it is presumed that about 3,000 genes
are present in this genome of about 3,100 kb. However,
only about 100 genes, mainly concerning amino acid
biosynthesis, are known in Corynebacterium glutamicum,
and the nucleotide sequences of most genes have not been
clarified hitherto.
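
The estimate above is simple arithmetic, and it can be sketched as follows. The sketch assumes an average spacing of roughly 1 kb per gene, a common rule of thumb for bacterial genomes; the text itself only refers to "the usual gene density of bacteria", so the constant and the function name are illustrative assumptions. With that spacing, a 3,100 kb genome gives a count in line with the "about 3,000 genes" stated above.

```python
# Rough gene-count estimate from genome size.
# ASSUMPTION: about 1 kb of genome per gene (rule of thumb, not from the text).
AVG_KB_PER_GENE = 1.0  # hypothetical illustrative value


def estimate_gene_count(genome_size_kb, kb_per_gene=AVG_KB_PER_GENE):
    """Return the approximate number of genes in a genome of the given size."""
    if kb_per_gene <= 0:
        raise ValueError("kb_per_gene must be positive")
    return round(genome_size_kb / kb_per_gene)


genome_kb = 3100   # genome size reported for C. glutamicum ATCC 13032
known_genes = 100  # approximate number of genes stated as known

estimated = estimate_gene_count(genome_kb)
print(f"estimated genes: about {estimated}")
print(f"fraction already known: about {known_genes / estimated:.1%}")
```

Under this assumption, the roughly 100 known genes amount to only a few percent of the presumed total.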

In recent years, the full nucleotide sequences of
the genomes of several microorganisms, such as Escherichia
coli, Mycobacterium tuberculosis, yeast, and the like, have
been determined (Science, 277: 1453-62 (1997); Nature,
393: 537-544 (1998); Nature, 387: 5-105 (1997)). Based on
the thus determined full nucleotide sequences, assumption
of gene regions and prediction of their function by
comparison with the nucleotide sequences of known genes
have been carried out. Thus, the functions of a great
number of genes have been presumed, without genetic,
biochemical or molecular biological experiments.

In recent years, moreover, techniques for 
monitoring expression levels of a great number of genes 
simultaneously or detecting mutations, using DNA chips, DNA 
arrays or the like in which a partial nucleic acid fragment 
of a gene or a partial nucleic acid fragment in genomic DNA 
other than a gene is fixed to a solid support, have been 
developed. The techniques contribute to the analysis of 
microorganisms, such as yeasts, Mycobacterium tuberculosis, 
Mycobacterium bovis used in BCG vaccines, and the like
(Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA,
96: 12833-38 (1999); Science, 284: 1520-23 (1999)).
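
As an illustration of how array data of this kind are commonly compared, the sketch below computes a log2 signal ratio for each spot between a reference sample and a test sample and lists the spots whose signal changed at least two-fold. The spot names, intensities, floor value and threshold are all hypothetical; the cited techniques are not limited to this calculation.

```python
import math


def log2_ratios(reference, sample, floor=1.0):
    """Return log2(sample / reference) for every spot present in both scans."""
    ratios = {}
    for spot, ref_signal in reference.items():
        if spot not in sample:
            continue
        # Clamp signals to a floor so background-level spots cannot divide by zero.
        ref = max(ref_signal, floor)
        smp = max(sample[spot], floor)
        ratios[spot] = math.log2(smp / ref)
    return ratios


def changed_spots(ratios, threshold=1.0):
    """Return spots whose signal changed at least 2**threshold-fold, up or down."""
    return sorted(spot for spot, r in ratios.items() if abs(r) >= threshold)


# Hypothetical signal intensities for three spots on one array design.
parent = {"orf_a": 200.0, "orf_b": 800.0, "orf_c": 50.0}
mutant = {"orf_a": 210.0, "orf_b": 100.0, "orf_c": 400.0}

ratios = log2_ratios(parent, mutant)
print(changed_spots(ratios))
```

Working on a log scale treats an eight-fold increase and an eight-fold decrease symmetrically, which is why a single absolute threshold suffices.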

SUMMARY OF THE INVENTION 
An object of the present invention is to provide a 
polynucleotide and a polypeptide derived from a 
microorganism of coryneform bacteria which are industrially 
useful, sequence information of the polynucleotide and the 
polypeptide, a method for analyzing the microorganism, an
apparatus and a system for use in the analysis, and a
method for breeding the microorganism. 

The present invention provides a polynucleotide and 
an oligonucleotide derived from a microorganism belonging 
to coryneform bacteria, oligonucleotide arrays to which the 
polynucleotides and the oligonucleotides are fixed, a 
polypeptide encoded by the polynucleotide, an antibody 
which recognizes the polypeptide, polypeptide arrays to 
which the polypeptides or the antibodies are fixed, a 
computer readable recording medium in which the nucleotide 
sequences of the polynucleotide and the oligonucleotide and 
the amino acid sequence of the polypeptide have been 
recorded, and a system based on the computer using the 
recording medium as well as a method of using the 
polynucleotide and/or polypeptide sequence information to 
make comparisons. 

BRIEF DESCRIPTION OF THE DRAWING 
Fig. 1 is a map showing the positions of typical
genes on the genome of Corynebacterium glutamicum ATCC
13032.

Fig. 2 shows electrophoresis results of
proteome analyses using proteins derived from (A)
Corynebacterium glutamicum ATCC 13032, (B) FERM BP-7134,
and (C) FERM BP-158.



Fig. 3 is a flow chart of an example of a system
using the computer readable media according to the present
invention.

Fig. 4 is a flow chart of an example of a system
using the computer readable media according to the present
invention.

DETAILED DESCRIPTION OF THE INVENTION 
This application is based on Japanese applications 
No. Hei. 11-377484 filed on December 16, 1999, No. 2000- 
159162 filed on April 7, 2000 and No. 2000-280988 filed on 
August 3, 2000, the entire contents of which are 
incorporated hereinto by reference. 

From the viewpoint that the determination of the
full nucleotide sequence of Corynebacterium glutamicum
would make it possible to specify gene regions which had
not been previously identified, to determine the function
of an unknown gene derived from the microorganism through
comparison with nucleotide sequences of known genes and
amino acid sequences of known genes, and to obtain a useful
mutant based on the presumption of the metabolic regulatory
mechanism of a useful product by the microorganism, the
inventors conducted intensive studies and, as a result,
found that the complete genome sequence of Corynebacterium
glutamicum can be determined by applying the whole genome
shotgun method.

Specifically, the present invention relates to the
following (1) to (65):

(1) A method for at least one of the following: 

(A) identifying a mutation point of a gene derived from 
a mutant of a coryneform bacterium, 

(B) measuring an expression amount of a gene derived 
from a coryneform bacterium, 

(C) analyzing an expression profile of a gene derived 
from a coryneform bacterium, 

(D) analyzing expression patterns of genes derived from 
a coryneform bacterium, or 

(E) identifying a gene homologous to a gene derived 
from a coryneform bacterium, 

said method comprising: 

(a) producing a polynucleotide array by adhering to a 
solid support at least two polynucleotides selected from 
the group consisting of first polynucleotides comprising 
the nucleotide sequence represented by any one of SEQ ID
NOS:1 to 3501, second polynucleotides which hybridize with
the first polynucleotides under stringent conditions, and 
third polynucleotides comprising a sequence of 10 to 200 
continuous bases of the first or second polynucleotides, 

(b) incubating the polynucleotide array with at least
one of a labeled polynucleotide derived from a coryneform
bacterium, a labeled polynucleotide derived from a mutant
of the coryneform bacterium or a labeled polynucleotide to
be examined, under hybridization conditions,

(c) detecting any hybridization, and 

(d) analyzing the result of the hybridization. 

As used herein, for example, the at least two 
polynucleotides can be at least two of the first 
polynucleotides, at least two of the second polynucleotides, 
at least two of the third polynucleotides, or at least two 
of the first, second and third polynucleotides. 

(2) The method according to (1), wherein the coryneform
bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus
Microbacterium.

(3) The method according to (2), wherein the
microorganism belonging to the genus Corynebacterium is
selected from the group consisting of Corynebacterium
glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium
thermoaminogenes, and Corynebacterium ammoniagenes.

(4) The method according to (1), wherein the
polynucleotide derived from a coryneform bacterium, the
polynucleotide derived from a mutant of the coryneform
bacterium or the polynucleotide to be examined is a gene
relating to the biosynthesis of at least one compound
selected from an amino acid, a nucleic acid, a vitamin, a
saccharide, an organic acid, and analogues thereof.

(5) The method according to (1), wherein the
polynucleotide to be examined is derived from Escherichia
coli.

(6) A polynucleotide array, comprising: 

at least two polynucleotides selected from the 
group consisting of first polynucleotides comprising the 
nucleotide sequence represented by any one of SEQ ID NOS:1
to 3501, second polynucleotides which hybridize with the 
first polynucleotides under stringent conditions, and third 
polynucleotides comprising 10 to 200 continuous bases of 
the first or second polynucleotides, and 

a solid support adhered thereto. 

As used herein, for example, the at least two 
polynucleotides can be at least two of the first 
polynucleotides, at least two of the second polynucleotides, 
at least two of the third polynucleotides, or at least two 
of the first, second and third polynucleotides. 

(7) A polynucleotide comprising the nucleotide sequence
represented by SEQ ID NO:1 or a polynucleotide having a
homology of at least 80% with the polynucleotide.

(8) A polynucleotide comprising any one of the
nucleotide sequences represented by SEQ ID NOS:2 to 3431,
or a polynucleotide which hybridizes with the
polynucleotide under stringent conditions.

(9) A polynucleotide encoding a polypeptide having any
one of the amino acid sequences represented by SEQ ID
NOS:3502 to 6931, or a polynucleotide which hybridizes
therewith under stringent conditions.

(10) A polynucleotide which is present in the 5' 
upstream or 3' downstream of a polynucleotide comprising 
the nucleotide sequence of any one of SEQ ID NOS:2 to 3431 
in a whole polynucleotide comprising the nucleotide 
sequence represented by SEQ ID NO:1, and has an activity of
regulating an expression of the polynucleotide. 

(11) A polynucleotide comprising 10 to 200 continuous
bases in the nucleotide sequence of the polynucleotide of
any one of (7) to (10), or a polynucleotide comprising a
nucleotide sequence complementary to the polynucleotide
comprising 10 to 200 continuous bases.

(12) A recombinant DNA comprising the polynucleotide of
any one of (8) to (11).

(13) A transformant comprising the polynucleotide of any
one of (8) to (11) or the recombinant DNA of (12).

(14) A method for producing a polypeptide, comprising:

culturing the transformant of (13) in a medium to
produce and accumulate a polypeptide encoded by the
polynucleotide of (8) or (9) in the medium, and

recovering the polypeptide from the medium.






(15) A method for producing at least one of an amino 
acid, a nucleic acid, a vitamin, a saccharide, an organic 
acid, and analogues thereof, comprising:

culturing the transformant of (13) in a medium to 
produce and accumulate at least one of an amino acid, a 
nucleic acid, a vitamin, a saccharide, an organic acid, and 
analogues thereof in the medium, and 

recovering the at least one of the amino acid, the 
nucleic acid, the vitamin, the saccharide, the organic acid, 
and analogues thereof from the medium. 

(16) A polypeptide encoded by a polynucleotide 
comprising the nucleotide sequence selected from SEQ ID 
NOS:2 to 3431.

(17) A polypeptide comprising the amino acid sequence 
selected from SEQ ID NOS:3502 to 6931. 

(18) The polypeptide according to (16) or (17), wherein
at least one amino acid is deleted, replaced, inserted or
added, said polypeptide having an activity which is
substantially the same as that of the polypeptide without
said at least one amino acid deletion, replacement,
insertion or addition.

(19) A polypeptide comprising an amino acid sequence
having a homology of at least 60% with the amino acid
sequence of the polypeptide of (16) or (17), and having an
activity which is substantially the same as that of the
polypeptide.

(20) An antibody which recognizes the polypeptide of any
one of (16) to (19).

(21) A polypeptide array, comprising: 

at least one polypeptide or partial fragment 
polypeptide selected from the polypeptides of (16) to (19) 
and partial fragment polypeptides of the polypeptides, and 

a solid support adhered thereto. 

(22) A polypeptide array, comprising: 

at least one antibody which recognizes a 
polypeptide or partial fragment polypeptide selected from 
the polypeptides of (16) to (19) and partial fragment 
polypeptides of the polypeptides, and 

a solid support adhered thereto. 

(23) A system based on a computer for identifying a 
target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one
nucleotide sequence information selected from SEQ ID NOS:1
to 3501, and target sequence or target structure motif
information;

(ii) a data storage device for at least temporarily
storing the input information;

(iii) a comparator that compares the at least one
nucleotide sequence information selected from SEQ ID NOS:1
to 3501 with the target sequence or target structure motif
information, recorded by the data storage device, for
screening and analyzing nucleotide sequence information
which is coincident with or analogous to the target
sequence or target structure motif information; and

(iv) an output device that shows a screening or
analyzing result obtained by the comparator.

(24) A method based on a computer for identifying a 
target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) inputting at least one nucleotide sequence
information selected from SEQ ID NOS:1 to 3501, and target
sequence information or target structure motif information
into a user input device;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence
information selected from SEQ ID NOS:1 to 3501 with the
target sequence or target structure motif information; and

(iv) screening and analyzing nucleotide sequence 
information which is coincident with or analogous to the 
target sequence or target structure motif information. 
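
The input, storage, comparison and screening steps of (24) can be illustrated by a minimal sketch. It treats "coincident" as an exact match and "analogous" as a match within a permitted number of mismatches; that interpretation, the record names, the sequences and the function names are assumptions made for illustration only and do not represent the disclosed system.

```python
def find_matches(sequence, target, max_mismatches=0):
    """Return (position, mismatches) for each window of `sequence` that matches
    `target` exactly (coincident) or within `max_mismatches` (analogous)."""
    if not target:
        raise ValueError("target must not be empty")
    sequence, target = sequence.upper(), target.upper()
    width = len(target)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(a != b for a, b in zip(window, target))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


def screen(records, target, max_mismatches=0):
    """Compare every stored record with the target and keep records with hits."""
    results = {}
    for name, seq in records.items():
        hits = find_matches(seq, target, max_mismatches)
        if hits:
            results[name] = hits
    return results


# Hypothetical stored nucleotide sequence records (step (ii)).
stored = {
    "record_1": "ATGGCTTATAATGCGTTGACA",
    "record_2": "ATGCCCGGGTTTAAACCC",
}

exact = screen(stored, "TATAAT")                       # coincident only
relaxed = screen(stored, "TATAAT", max_mismatches=1)   # coincident or analogous
print(exact)
print(relaxed)
```

Positions are zero-based offsets into each stored record. A practical system would more likely rely on an alignment or profile search tool than on a fixed-width mismatch count.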

(25) A system based on a computer for identifying a 
target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one amino
acid sequence information selected from SEQ ID NOS:3502 to
7001, and target sequence or target structure motif
information;

(ii) a data storage device for at least temporarily
storing the input information;

(iii) a comparator that compares the at least one amino
acid sequence information selected from SEQ ID NOS:3502 to
7001 with the target sequence or target structure motif
information, recorded by the data storage device, for
screening and analyzing amino acid sequence information
which is coincident with or analogous to the target
sequence or target structure motif information; and

(iv) an output device that shows a screening or
analyzing result obtained by the comparator.

(26) A method based on a computer for identifying a 
target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) inputting at least one amino acid sequence 
information selected from SEQ ID NOS:3502 to 7001, and 
target sequence information or target structure motif 
information into a user input device; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one amino acid sequence 
information selected from SEQ ID NOS:3502 to 7001 with the 
target sequence or target structure motif information; and 

(iv) screening and analyzing amino acid sequence
information which is coincident with or analogous to the
target sequence or target structure motif information.

(27) A system based on a computer for determining a
function of a polypeptide encoded by a polynucleotide
having a target nucleotide sequence derived from a
coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one
nucleotide sequence information selected from SEQ ID NOS:2
to 3501, function information of a polypeptide encoded by
the nucleotide sequence, and target nucleotide sequence
information;

(ii) a data storage device for at least temporarily
storing the input information;

(iii) a comparator that compares the at least one
nucleotide sequence information selected from SEQ ID NOS:2
to 3501 with the target nucleotide sequence information
for determining a function of a polypeptide encoded by a
polynucleotide having the target nucleotide sequence which
is coincident with or analogous to the polynucleotide
having at least one nucleotide sequence selected from SEQ
ID NOS:2 to 3501; and

(iv) an output device that shows a function obtained by
the comparator.

(28) A method based on a computer for determining a
function of a polypeptide encoded by a polynucleotide
having a target nucleotide sequence derived from a
coryneform bacterium, comprising the following:



(i) inputting at least one nucleotide sequence 
information selected from SEQ ID NOS:2 to 3501, function 
information of a polypeptide encoded by the nucleotide 
sequence, and target nucleotide sequence information; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one nucleotide sequence
information selected from SEQ ID NOS:2 to 3501 with the
target nucleotide sequence information; and

(iv) determining a function of a polypeptide encoded by
a polynucleotide having the target nucleotide sequence
which is coincident with or analogous to the polynucleotide
having at least one nucleotide sequence selected from SEQ
ID NOS:2 to 3501.
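
One simple way to picture the function determination of (28) is to assign to the target the function recorded for the most similar stored sequence. The sketch below does this with an ungapped, position-by-position percent identity; the 80% cut-off, the sequences, the function labels and the function names are all hypothetical, and gapped alignment, which a practical implementation would normally use, is deliberately left out.

```python
def percent_identity(a, b):
    """Percent of identical positions over the shorter sequence, compared
    position by position without gaps (a deliberate simplification)."""
    length = min(len(a), len(b))
    if length == 0:
        return 0.0
    same = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * same / length


def infer_function(target, annotated, min_identity=80.0):
    """Return (name, function, identity) of the best stored match at or above
    `min_identity`, or None when no stored sequence is similar enough."""
    best = None
    for name, (seq, function) in annotated.items():
        identity = percent_identity(target, seq)
        if identity >= min_identity and (best is None or identity > best[2]):
            best = (name, function, identity)
    return best


# Hypothetical stored sequences with recorded function information.
annotated = {
    "orf_x": ("ATGGCTAAAGGTCTG", "example function X"),
    "orf_y": ("ATGCCCGGGTTTAAA", "example function Y"),
}

result = infer_function("ATGGCTAAAGGTCTA", annotated)
print(result)
```

Returning None when nothing reaches the cut-off keeps the sketch from assigning a function on the basis of a weak match.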

(29) A system based on a computer for determining a
function of a polypeptide having a target amino acid
sequence derived from a coryneform bacterium, comprising
the following:

(i) a user input device that inputs at least one amino 
acid sequence information selected from SEQ ID NOS:3502 to 
7001, function information based on the amino acid sequence, 
and target amino acid sequence information; 

(ii) a data storage device for at least temporarily
storing the input information;

(iii) a comparator that compares the at least one amino
acid sequence information selected from SEQ ID NOS:3502 to
7001 with the target amino acid sequence information for
determining a function of a polypeptide having the target
amino acid sequence which is coincident with or analogous
to the polypeptide having at least one amino acid sequence
selected from SEQ ID NOS:3502 to 7001; and

(iv) an output device that shows a function obtained by
the comparator.

(30) A method based on a computer for determining a 
function of a polypeptide having a target amino acid 
sequence derived from a coryneform bacterium, comprising 
the following: 

(i) inputting at least one amino acid sequence
information selected from SEQ ID NOS:3502 to 7001, function
information based on the amino acid sequence, and target
amino acid sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence
information selected from SEQ ID NOS:3502 to 7001 with the
target amino acid sequence information; and

(iv) determining a function of a polypeptide having the
target amino acid sequence which is coincident with or
analogous to the polypeptide having at least one amino acid
sequence selected from SEQ ID NOS:3502 to 7001.

(31) The system according to any one of (23), (25), (27)
and (29), wherein the coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or
the genus Microbacterium.

(32) The method according to any one of (24), (26), (28)
and (30), wherein the coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or
the genus Microbacterium.

(33) The system according to (31), wherein the
microorganism belonging to the genus Corynebacterium is
selected from the group consisting of Corynebacterium
glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium
thermoaminogenes, and Corynebacterium ammoniagenes.

(34) The method according to (32), wherein the
microorganism belonging to the genus Corynebacterium is
selected from the group consisting of Corynebacterium
glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium
thermoaminogenes, and Corynebacterium ammoniagenes.

(35) A recording medium or storage device which is
readable by a computer in which at least one nucleotide
sequence information selected from SEQ ID NOS:1 to 3501 or
function information based on the nucleotide sequence is
recorded, and is usable in the system of (23) or (27) or
the method of (24) or (28).

(36) A recording medium or storage device which is
readable by a computer in which at least one amino acid
sequence information selected from SEQ ID NOS:3502 to 7001
or function information based on the amino acid sequence is
recorded, and is usable in the system of (25) or (29) or
the method of (26) or (30).

(37) The recording medium or storage device according to
(35) or (36), which is a computer readable recording medium
selected from the group consisting of a floppy disc, a hard
disc, a magnetic tape, a random access memory (RAM), a read
only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R,
CD-RW, DVD-ROM, DVD-RAM and DVD-RW.

(38) A polypeptide having a homoserine dehydrogenase
activity, comprising an amino acid sequence in which the
Val residue at the 59th position in the amino acid sequence of
homoserine dehydrogenase derived from a coryneform
bacterium is replaced with an amino acid residue other than
a Val residue.

(39) A polypeptide comprising an amino acid sequence in
which the Val residue at the 59th position in the amino
acid sequence represented by SEQ ID NO:695? is replaced
with an amino acid residue other than a Val residue.

(40) The polypeptide according to (38) or (39), wherein
the Val residue at the 59th position is replaced with an
Ala residue.

(41) A polypeptide having pyruvate carboxylase activity,
comprising an amino acid sequence in which the Pro residue
at the 458th position in the amino acid sequence of
pyruvate carboxylase derived from a coryneform bacterium is
replaced with an amino acid residue other than a Pro
residue.

(42) A polypeptide comprising an amino acid sequence in
which the Pro residue at the 458th position in the amino
acid sequence represented by SEQ ID NO:4265 is replaced
with an amino acid residue other than a Pro residue.

(43) The polypeptide according to (41) or (42), wherein
the Pro residue at the 458th position is replaced with a
Ser residue.

(44) The polypeptide according to any one of (38) to
(43), which is derived from Corynebacterium glutamicum.

(45) A DNA encoding the polypeptide of any one of (38)
to (44).

(46) A recombinant DNA comprising the DNA of (45).

(47) A transformant comprising the recombinant DNA of
(46).

(48) A transformant comprising in its chromosome the DNA
of (45).

(49) The transformant according to (47) or (48), which
is derived from a coryneform bacterium.

(50) The transformant according to (49), which is
derived from Corynebacterium glutamicum.

(51) A method for producing L-lysine, comprising:

culturing the transformant of any one of (47) to
(50) in a medium to produce and accumulate L-lysine in the
medium, and

recovering the L-lysine from the culture.

(52) A method for breeding a coryneform bacterium using
the nucleotide sequence information represented by SEQ ID
NOS:1 to 3431, comprising the following:

(i) comparing a nucleotide sequence of a genome or gene
of a production strain derived from a coryneform bacterium which
has been subjected to mutation breeding so as to produce at
least one compound selected from an amino acid, a nucleic
acid, a vitamin, a saccharide, an organic acid, and
analogues thereof by a fermentation method, with a
corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the
production strain based on a result obtained by (i);

(iii) introducing the mutation point into a coryneform
bacterium which is free of the mutation point; and

(iv) examining productivity by the fermentation method
of the compound selected in (i) of the coryneform bacterium
obtained in (iii).
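
Steps (i) and (ii) of (52) amount to comparing a production-strain sequence with the corresponding reference sequence and listing the positions at which they differ. A minimal sketch, assuming the two sequences are already aligned and of equal length and that only base substitutions are of interest, is given below; the sequences and the function name are hypothetical, and insertions or deletions would require a proper alignment step first.

```python
def mutation_points(reference, production):
    """Return (1-based position, reference base, production base) for each
    substitution between two pre-aligned sequences of equal length."""
    if len(reference) != len(production):
        raise ValueError("sequences must be pre-aligned to the same length")
    pairs = zip(reference.upper(), production.upper())
    return [(i, r, p) for i, (r, p) in enumerate(pairs, start=1) if r != p]


# Hypothetical gene fragment: one base differs in the production strain.
reference_seq = "ATGGTTGCACTG"
production_seq = "ATGGCTGCACTG"

points = mutation_points(reference_seq, production_seq)
print(points)
```

In this made-up fragment the single substitution changes the second codon from GTT (Val) to GCT (Ala), which is the kind of mutation point that would then be introduced into a strain free of it in step (iii).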

(53) The method according to (52), wherein the gene is a
gene encoding an enzyme in a biosynthetic pathway or a
signal transmission pathway.

(54) The method according to (52), wherein the mutation
point is a mutation point relating to a useful mutation
which improves or stabilizes the productivity.

(55) A method for breeding a coryneform bacterium using
the nucleotide sequence information represented by SEQ ID
NOS:1 to 3431, comprising:

(i) comparing a nucleotide sequence of a genome or gene
of a production strain derived from a coryneform bacterium which
has been subjected to mutation breeding so as to produce at
least one compound selected from an amino acid, a nucleic
acid, a vitamin, a saccharide, an organic acid, and
analogues thereof by a fermentation method, with a
corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the
production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform
bacterium having the mutation point; and

(iv) examining productivity by the fermentation method
of the compound selected in (i) of the coryneform bacterium
obtained in (iii).

(56) The method according to (55), wherein the gene is a
gene encoding an enzyme in a biosynthetic pathway or a
signal transmission pathway.

(57) The method according to (55), wherein the mutation
point is a mutation point which decreases or destabilizes
the productivity.

(58) A method for breeding a coryneform bacterium using
the nucleotide sequence information represented by SEQ ID
NOS:2 to 3431, comprising the following:

(i) identifying an isozyme relating to biosynthesis of
at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and
analogues thereof, based on the nucleotide sequence
information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an
isozyme having the same activity;

(iii) mutating all genes encoding the isozyme having the
same activity simultaneously; and

(iv) examining productivity by a fermentation method of
the compound selected in (i) of the coryneform bacterium
which has been transformed with the gene obtained in (iii).

(59) A method for breeding a coryneform bacterium using 
the nucleotide sequence information represented by SEQ ID 
NOS:2 to 3431, comprising the following: 

(i) arranging function information of an open reading
frame (ORF) represented by SEQ ID NOS:2 to 3431;

(ii) allowing the arranged ORF to correspond to an
enzyme on a known biosynthesis or signal transmission
pathway;

(iii) explicating an unknown biosynthesis pathway or
signal transmission pathway of a coryneform bacterium in
combination with information relating to a known biosynthesis
pathway or signal transmission pathway of a coryneform
bacterium;

(iv) comparing the pathway explicated in (iii) with a
biosynthesis pathway of a target useful product; and

(v) transgenetically varying a coryneform bacterium
based on the nucleotide sequence information to either
strengthen a pathway which is judged to be important in the
biosynthesis of the target useful product in (iv) or weaken
a pathway which is judged not to be important in the
biosynthesis of the target useful product in (iv).

(60) A coryneform bacterium, bred by the method of any
one of (52) to (59).

(61) The coryneform bacterium according to (60), which
is a microorganism belonging to the genus Corynebacterium,
the genus Brevibacterium, or the genus Microbacterium.

(62) The coryneform bacterium according to (61), wherein
the microorganism belonging to the genus Corynebacterium is
selected from the group consisting of Corynebacterium
glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium
thermoaminogenes, and Corynebacterium ammoniagenes.

(63) A method for producing at least one compound
selected from an amino acid, a nucleic acid, a vitamin, a
saccharide, an organic acid and an analogue thereof,
comprising:

culturing a coryneform bacterium of any one of (60)
to (62) in a medium to produce and accumulate at least one
compound selected from an amino acid, a nucleic acid, a
vitamin, a saccharide, an organic acid, and analogues
thereof; and

recovering the compound from the culture.

(64) The method according to (63), wherein the compound
is L-lysine.

(65) A method for identifying a protein relating to
useful mutation based on proteome analysis, comprising the
following:

(i) preparing

a protein derived from a bacterium of a production
strain of a coryneform bacterium which has been subjected
to mutation breeding by a fermentation process so as to
produce at least one compound selected from an amino acid,
a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof, and

a protein derived from a bacterium of a parent
strain of the production strain;

(ii) separating the proteins prepared in (i) by two
dimensional electrophoresis;

(iii) detecting the separated proteins, and comparing an
expression amount of the protein derived from the
production strain with that derived from the parent strain;

(iv) treating the protein showing different expression
amounts as a result of the comparison with a peptidase to
extract peptide fragments;

(v) analyzing amino acid sequences of the peptide
fragments obtained in (iv); and

(vi) comparing the amino acid sequences obtained in (v)
with the amino acid sequence represented by SEQ ID NOS:3502
to 7001 to identify the protein having the amino acid
sequences.

As used herein, the term "proteome", which is a
word coined by combining "protein" with "genome", refers to
a method for examining a gene at the polypeptide level.

(66) The method according to (65), wherein the
coryneform bacterium is a microorganism belonging to the
genus Corynebacterium, the genus Brevibacterium, or the
genus Microbacterium.

(67) The method according to (66), wherein the
microorganism belonging to the genus Corynebacterium is
selected from the group consisting of Corynebacterium
glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium
thermoaminogenes, and Corynebacterium ammoniagenes.

(68) A biologically pure culture of Corynebacterium
glutamicum AHP-3 (FERM BP-7382).

The present invention will be described below in
more detail, based on the determination of the full
nucleotide sequence of coryneform bacteria.



1. Determination of full nucleotide sequence of coryneform 
bacteria 

The term "coryneform bacteria" as used herein means
a microorganism belonging to the genus Corynebacterium, the
genus Brevibacterium or the genus Microbacterium as defined
in Bergey's Manual of Determinative Bacteriology, 8: 599
(1974).

Examples include Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium glutamicum, Corynebacterium herculis,
Corynebacterium lilium, Corynebacterium melassecola,
Corynebacterium thermoaminogenes, Brevibacterium
saccharolyticum, Brevibacterium immariophilum,
Brevibacterium roseum, Brevibacterium thiogenitalis,
Microbacterium ammoniaphilum, and the like.

Specific examples include Corynebacterium
acetoacidophilum ATCC 13870, Corynebacterium
acetoglutamicum ATCC 15806, Corynebacterium callunae ATCC
15991, Corynebacterium glutamicum ATCC 13032,
Corynebacterium glutamicum ATCC 13060, Corynebacterium
glutamicum ATCC 13826 (prior genus and species:
Brevibacterium flavum, or Corynebacterium lactofermentum),
Corynebacterium glutamicum ATCC 14020 (prior genus and
species: Brevibacterium divaricatum), Corynebacterium
glutamicum ATCC 13869 (prior genus and species:
Brevibacterium lactofermentum), Corynebacterium herculis
ATCC 13868, Corynebacterium lilium ATCC 15990,
Corynebacterium melassecola ATCC 17965, Corynebacterium
thermoaminogenes FERM 9244, Brevibacterium saccharolyticum
ATCC 14066, Brevibacterium immariophilum ATCC 14068,
Brevibacterium roseum ATCC 13825, Brevibacterium
thiogenitalis ATCC 19240, Microbacterium ammoniaphilum ATCC
15354, and the like.

(1) Preparation of genome DNA of coryneform bacteria

Coryneform bacteria can be cultured by a 
conventional method. 

Any of a natural medium and a synthetic medium can 
be used, so long as it is a medium suitable for efficient 
culturing of the microorganism, and it contains a carbon 
source, a nitrogen source, an inorganic salt, and the like 
which can be assimilated by the microorganism. 

For Corynebacterium glutamicum, for example, a BY
medium (7 g/l meat extract, 10 g/l peptone, 3 g/l sodium
chloride, 5 g/l yeast extract, pH 7.2) containing 1% of
glycine and the like can be used. The culturing is carried
out at 25 to 35°C overnight.

After the completion of the culture, the cells are
recovered from the culture by centrifugation. The
resulting cells are washed with a washing solution.

Examples of the washing solution include STE buffer
(10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l
ethylenediaminetetraacetic acid (hereinafter referred to as
"EDTA"), pH 9.0), and the like.

Genome DNA can be obtained from the washed cells
according to a conventional method for obtaining genome DNA,
namely, lysing the cell wall of the cells using a lysozyme
and a surfactant (SDS, etc.), eliminating proteins and the
like using a phenol solution and a phenol/chloroform
solution, and then precipitating the genome DNA with
ethanol or the like. Specifically, the following method
can be illustrated.

The washed cells are suspended in a washing
solution containing 5 to 20 mg/l lysozyme. After shaking,
5 to 20% SDS is added to lyse the cells. Usually, gentle
shaking is performed at 25 to 40°C for 30 minutes to 2 hours.
After shaking, the suspension is maintained at 60 to 70°C
for 5 to 15 minutes for the lysis.

After the lysis, the suspension is cooled to
ordinary temperature, and 5 to 20 ml of Tris-neutralized
phenol is added thereto, followed by gently shaking at room
temperature for 15 to 45 minutes.

After shaking, centrifugation (15,000 x g, 20
minutes, 20°C) is carried out to fractionate the aqueous
layer.

After performing extraction with phenol/chloroform
and extraction with chloroform (twice) in the same manner,
3 mol/l sodium acetate solution (pH 5.2) and isopropanol
are added to the aqueous layer at 1/10 times volume and 2
times volume of the aqueous layer, respectively, followed
by gently stirring to precipitate the genome DNA.
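
The volume ratios stated above can be illustrated by the following minimal Python sketch. The function name `precipitation_volumes` and the 5.0 ml example value are assumptions introduced only for illustration and are not part of the described method.

```python
def precipitation_volumes(aqueous_ml):
    """Return (sodium acetate ml, isopropanol ml) for a given aqueous-layer volume."""
    # 1/10 volume of 3 mol/l sodium acetate solution (pH 5.2)
    sodium_acetate_ml = aqueous_ml / 10
    # 2 volumes of isopropanol, relative to the aqueous layer
    isopropanol_ml = aqueous_ml * 2
    return sodium_acetate_ml, isopropanol_ml


# Hypothetical example: 5.0 ml of recovered aqueous layer
acetate_ml, isopropanol_ml = precipitation_volumes(5.0)
print(acetate_ml, isopropanol_ml)
```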

The genome DNA is dissolved again in a buffer
containing 0.01 to 0.04 mg/ml RNase. As an example of the
buffer, TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l
EDTA, pH 8.0) can be used. After dissolving, the resultant
solution is maintained at 25 to 40°C for 20 to 50 minutes
and then extracted successively with phenol,
phenol/chloroform and chloroform as in the above case.

After the extraction, isopropanol precipitation is 
carried out and the resulting DNA precipitate is washed 
with 70% ethanol, followed by air drying, and then 
dissolved in TE buffer to obtain a genome DNA solution.

(2) Production of shotgun library

Methods for producing a genome DNA library using the
genome DNA of the coryneform bacteria prepared in the above
(1) include a method described in Molecular Cloning, A
Laboratory Manual, Second Edition (1989) (hereinafter
referred to as "Molecular Cloning, 2nd ed."). In
particular, the following method can be exemplified to
prepare a genome DNA library appropriately usable in
determining the full nucleotide sequence by the shotgun
method.

To 0.01 mg of the genome DNA of the coryneform
bacteria prepared in the above (1), a buffer, such as TE
buffer or the like, is added to give a total volume of 0.4
ml. Then, the genome DNA is digested into fragments of 1
to 10 kb with a sonicator (Yamato Powersonic Model 50).
The treatment with the sonicator is performed at an output
of 20 continuously for 5 seconds.

The resulting genome DNA fragments are blunt-ended 
using DNA blunting kit (manufactured by Takara Shuzo) or 
the like. 

The blunt-ended genome fragments are fractionated
by agarose gel or polyacrylamide gel electrophoresis and
genome fragments of 1 to 2 kb are cut out from the gel.

To the gel, 0.2 to 0.5 ml of a buffer for eluting
DNA, such as MG elution buffer (0.5 mol/l ammonium acetate,
10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) or
the like, is added, followed by shaking at 25 to 40°C
overnight to elute DNA.

The resulting DNA eluate is treated with
phenol/chloroform and then precipitated with ethanol to
obtain a genome library insert.

This insert is ligated into a suitable vector, such
as pUC18 SmaI/BAP (manufactured by Amersham Pharmacia
Biotech) or the like, using T4 ligase (manufactured by
Takara Shuzo) or the like. The ligation can be carried out
by allowing a mixture to stand at 10 to 20°C for 20 to 50
hours.

The resulting ligation product is precipitated with
ethanol and dissolved in 5 to 20 µl of TE buffer.

Escherichia coli is transformed in accordance with
a conventional method using 0.5 to 2 µl of the ligation
solution. Examples of the transformation method include
the electroporation method using ELECTRO MAX DH10B
(manufactured by Life Technologies) for Escherichia coli.
The electroporation method can be carried out under the
conditions as described in the manufacturer's instructions.

The transformed Escherichia coli is spread on a
suitable selection medium containing agar, for example, LB
plate medium containing 10 to 100 mg/l ampicillin (LB
medium (10 g/l bactotryptone, 5 g/l yeast extract, 10 g/l
sodium chloride, pH 7.0) containing 1.6% of agar) when
pUC18 is used as the cloning vector, and cultured therein.

The transformant can be obtained as colonies formed
on the plate medium. In this step, it is possible to
select the transformant having the recombinant DNA
containing the genome DNA as white colonies by adding X-gal
and IPTG (isopropyl-β-thiogalactopyranoside) to the plate
medium.

The transformant is allowed to stand for culturing
in a 96-well titer plate to which 0.05 ml of the LB medium
containing 0.1 mg/ml of ampicillin has been added in each
well. The resulting culture can be used in an experiment
of (4) described below. Also, the culture solution can be
stored at -80°C by adding 0.05 ml per well of the LB medium
containing 20% glycerol to the culture solution, followed
by mixing, and the stored culture solution can be used at
any time.

(3) Production of cosmid library 

The genome DNA (0.1 mg) of the coryneform bacteria
prepared in the above (1) is partially digested with a
restriction enzyme, such as Sau3AI or the like, and then
ultracentrifuged (26,000 rpm, 18 hours, 20°C) under a 10 to
40% sucrose density gradient using a 10% sucrose buffer (1
mol/l NaCl, 20 mmol/l Tris hydrochloride, 5 mmol/l EDTA,
10% sucrose, pH 8.0) and a 40% sucrose buffer (the 10%
sucrose buffer with its sucrose concentration raised to 40%).

After the centrifugation, the thus separated
solution is fractionated into tubes at 1 ml per tube.
After confirming the DNA fragment size of each fraction by
agarose gel electrophoresis, a fraction rich in DNA
fragments of about 40 kb is precipitated with ethanol.

The resulting DNA fragment is ligated to a cosmid
vector having a cohesive end which can be ligated to the
fragment. When the genome DNA is partially digested with
Sau3AI, the partially digested product can be ligated to,
for example, the BamHI site of SuperCos1 (manufactured by
Stratagene) in accordance with the manufacturer's
instructions.

The resulting ligation product is packaged using a
packaging extract which can be prepared by a method
described in Molecular Cloning, 2nd ed. and then used in
transforming Escherichia coli. More specifically, the
ligation product is packaged using, for example, a
commercially available packaging extract, Gigapack III Gold
Packaging Extract (manufactured by Stratagene) in
accordance with the manufacturer's instructions and then
introduced into Escherichia coli XL-1-BlueMR (manufactured
by Stratagene) or the like.

The thus transformed Escherichia coli is spread on
an LB plate medium containing ampicillin, and cultured
therein.

The transformant can be obtained as colonies formed 
on the plate medium. 

The transformant is subjected to standing culture
in a 96-well titer plate to which 0.05 ml of the LB medium
containing 0.1 mg/ml ampicillin has been added.

The resulting culture can be employed in an
experiment of (4) described below. Also, the culture
solution can be stored at -80°C by adding 0.05 ml per well
of the LB medium containing 20% glycerol to the culture
solution, followed by mixing, and the stored culture
solution can be used at any time.

(4) Determination of nucleotide sequence

(4-1) Preparation of template 

The full nucleotide sequence of genome DNA of
coryneform bacteria can be determined basically according
to the whole genome shotgun method (Science, 269: 496-512
(1995)).

The template used in the whole genome shotgun
method can be prepared by PCR using the library prepared in
the above (2) (DNA Research, 5: 1-9 (1998)).

Specifically, the template can be prepared as
follows.

The clone derived from the whole genome shotgun
library is inoculated by using a replicator (manufactured
by GENETIX) into each well of a 96-well plate to which 0.08
ml per well of the LB medium containing 0.1 mg/ml
ampicillin has been added, followed by stationarily
culturing at 37°C overnight.

Next, the culture solution is transferred, using a
copy plate (manufactured by Tokken), into each well of a
96-well reaction plate (manufactured by PE Biosystems) to
which 0.025 ml per well of a PCR reaction solution has been
added using TaKaRa Ex Taq (manufactured by Takara Shuzo).
Then, PCR is carried out in accordance with the protocol by
Makino et al. (DNA Research, 5: 1-9 (1998)) using GeneAmp
PCR System 9700 (manufactured by PE Biosystems) to amplify
the inserted fragments.

The excess primers and nucleotides are
eliminated using a kit for purifying a PCR product, and the
product is used as the template in the sequencing reaction.

It is also possible to determine the nucleotide
sequence using a double-stranded DNA plasmid as a template.

The double-stranded DNA plasmid used as the 
template can be obtained by the following method. 

The clone derived from the whole genome shotgun
library is inoculated into each well of a 24- or 96-well
plate to which 1.5 ml per well of a 2 x YT medium (16 g/l
bactotryptone, 10 g/l yeast extract, 5 g/l sodium chloride,
pH 7.0) containing 0.05 mg/ml ampicillin has been added,
followed by culturing under shaking at 37°C overnight.

The double-stranded DNA plasmid can be prepared
from the culture solution using an automatic plasmid
preparing machine KURABO PI-50 (manufactured by Kurabo
Industries), a MultiScreen (manufactured by Millipore) or
the like, according to each protocol.

To purify the plasmid, Biomek 2000 manufactured by
Beckman Coulter and the like can be used.

The resulting purified double-stranded DNA plasmid
is dissolved in water to give a concentration of about 0.1
mg/ml. Then, it can be used as the template in sequencing.

(4-2) Sequencing reaction

The sequencing reaction can be carried out
according to a commercially available sequence kit or the
like. A specific method is exemplified below.

To 6 µl of a solution of ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit
(manufactured by PE Biosystems), 1 to 2 pmol of an M13
regular direction primer (M13-21) or an M13 reverse
direction primer (M13REV) (DNA Research, 5: 1-9 (1998)) and
50 to 200 ng of the template prepared in the above (4-1)
(the PCR product or plasmid) are added to give 10 µl of a
sequencing reaction solution.

A dye terminator sequencing reaction (35 to 55
cycles) is carried out using this reaction solution and
GeneAmp PCR System 9700 (manufactured by PE Biosystems) or
the like. The cycle parameter can be determined in
accordance with a commercially available kit, for example,
the manufacturer's instructions attached to ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit.

The sample can be purified using a commercially
available product, such as MultiScreen HV plate
(manufactured by Millipore) or the like, according to the
manufacturer's instructions.

The thus purified reaction product is precipitated
with ethanol, dried and then used for the analysis. The
dried reaction product can be stored in the dark at -30°C
and the stored reaction product can be used at any time.

The dried reaction product can be analyzed using a
commercially available sequencer and an analyzer according
to the manufacturer's instructions.

Examples of the commercially available sequencer
include ABI PRISM 377 DNA Sequencer (manufactured by PE
Biosystems). Examples of the analyzer include ABI PRISM
3700 DNA Analyzer (manufactured by PE Biosystems).

(5) Assembly 

Software such as phred (The University of
Washington) or the like can be used as a base caller for
analyzing the sequence information obtained in the above
(4). Software such as Cross_Match (The University of
Washington), SPS Cross_Match (manufactured by Southwest
Parallel Software) or the like can be used to mask the
vector sequence information.
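
As a non-limiting illustration of what masking of vector sequence information means, the following Python sketch replaces vector-derived bases in a read with the character X. It is a simplification under stated assumptions: Cross_Match itself compares sequences by alignment, whereas this sketch only finds exact matches, and the function name `mask_vector` and the example sequences are hypothetical.

```python
def mask_vector(read, vector_fragment, mask_char="X"):
    """Replace every exact occurrence of vector_fragment in read with mask characters."""
    read_upper = read.upper()
    fragment = vector_fragment.upper()
    if not fragment:
        # Nothing to mask when no vector sequence is supplied
        return read_upper
    return read_upper.replace(fragment, mask_char * len(fragment))


# Hypothetical read whose first 12 bases derive from the cloning vector
read = "GGATCCTCTAGAATGGCTAAGCTTGG"
print(mask_vector(read, "GGATCCTCTAGA"))
```

Masked positions keep their length, so coordinates in the read remain unchanged for the subsequent assembly.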

For the assembly, software such as phrap (The
University of Washington), SPS phrap (manufactured by
Southwest Parallel Software) or the like can be used.

In the above analysis and output of the results
thereof, a computer such as UNIX, PC, Macintosh, and the
like can be used.

Contigs obtained by the assembly can be analyzed
using a graphical editor such as consed (The University of
Washington) or the like.

It is also possible to perform the series of
operations from the base call to the assembly in a single
batch using a script phredPhrap attached to consed.

As used herein, software will be understood to also
be referred to as a comparator.

(6) Determination of nucleotide sequence in gap part 

Each of the cosmids in the cosmid library
constructed in the above (3) is prepared in the same manner
as in the preparation of the double-stranded DNA plasmid
described in the above (4-1). The nucleotide sequence at
the end of the insert fragment of the cosmid is determined
using a commercially available kit, such as ABI PRISM
BigDye Terminator Cycle Sequencing Ready Reaction Kit
(manufactured by PE Biosystems) according to the
manufacturer's instructions.

About 800 cosmid clones are sequenced at both ends
of the inserted fragment to detect a nucleotide sequence in
the contig derived from the shotgun sequencing obtained in
(5) which is coincident with the sequence. Thus, the chain
linkage between respective cosmid clones and respective
contigs is clarified, and mutual alignment is carried out.
Furthermore, the results are compared with known physical
maps to map the cosmids and the contigs. In the case of
Corynebacterium glutamicum ATCC 13032, a physical map of
Mol. Gen. Genet., 252: 255-265 (1996) can be used.

The sequence in the region which cannot be covered
with the contigs (gap part) can be determined by the
following method.

Clones containing sequences positioned at the ends
of the contigs are selected. Among these, a clone wherein
only one end of the inserted fragment has been determined
is selected and the sequence at the opposite end of the
inserted fragment is determined.

A shotgun library clone or a cosmid clone derived 
therefrom containing the sequences at the respective ends 
of the inserted fragments in the two contigs is identified 
and the full nucleotide sequence of the inserted fragment 
of the clone is determined. 

According to this method, the nucleotide sequence 
of the gap part can be determined. 
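
The selection of clones that span a gap can be sketched in Python as follows. This is only an illustrative sketch: the function name `find_bridging_clones`, the clone names and the contig names are hypothetical, and the placement of each insert end on a contig is assumed to have been obtained beforehand by sequence comparison.

```python
def find_bridging_clones(end_hits):
    """Map each pair of distinct contigs to the clones whose two insert ends hit them.

    end_hits: dict of clone name -> (contig hit by one end, contig hit by the
    other end); None marks an end that has not yet been sequenced or placed.
    """
    bridges = {}
    for clone, (contig_a, contig_b) in end_hits.items():
        if contig_a is None or contig_b is None or contig_a == contig_b:
            continue  # incomplete clone, or clone lying within a single contig
        pair = tuple(sorted((contig_a, contig_b)))
        bridges.setdefault(pair, []).append(clone)
    return bridges


# Hypothetical end placements for three cosmid clones and one shotgun clone
hits = {
    "cos001": ("contig1", "contig2"),
    "cos002": ("contig2", "contig2"),
    "sg0457": ("contig3", None),
    "cos003": ("contig2", "contig1"),
}
print(find_bridging_clones(hits))
```

A clone such as the hypothetical "sg0457", for which only one end has been placed, is a candidate for sequencing of its opposite end as described above.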

When no shotgun library clone or cosmid clone
covering the gap part is available, primers complementary
to the end sequences of the two different contigs are
prepared and the DNA fragment in the gap part is amplified.
Then, sequencing is performed by the primer walking method
using the amplified DNA fragment as a template or by the
shotgun method in which the sequence of a shotgun clone
prepared from the amplified DNA fragment is determined.
Thus, the nucleotide sequence of the above-described region
can be determined.

In a region showing a low sequence accuracy,
primers are synthesized using AUTOFINISH function and
NAVIGATING function of consed (The University of
Washington), and the sequence is determined by the primer
walking method to improve the sequence accuracy.

Examples of the thus determined nucleotide sequence
of the full genome include the full nucleotide sequence of
genome of Corynebacterium glutamicum ATCC 13032 represented
by SEQ ID NO:1.

(7) Determination of nucleotide sequence of microorganism
genome DNA using the nucleotide sequence represented by SEQ
ID NO:1

A nucleotide sequence of a polynucleotide having a
homology of 80% or more with the full nucleotide sequence
of Corynebacterium glutamicum ATCC 13032 represented by SEQ
ID NO:1 as determined above can also be determined using
the nucleotide sequence represented by SEQ ID NO:1, and the
polynucleotide having a nucleotide sequence having a
homology of 80% or more with the nucleotide sequence
represented by SEQ ID NO:1 of the present invention is
within the scope of the present invention. The term
"polynucleotide having a nucleotide sequence having a
homology of 80% or more with the nucleotide sequence
represented by SEQ ID NO:1 of the present invention" is a
polynucleotide in which a full nucleotide sequence of the
chromosome DNA can be determined using as a primer an
oligonucleotide composed of continuous 5 to 50 nucleotides
in the nucleotide sequence represented by SEQ ID NO:1, for
example, according to PCR using the chromosome DNA as a
template. A particularly preferred primer in determination
of the full nucleotide sequence is an oligonucleotide
having nucleotide sequences which are positioned at the
interval of about 300 to 500 bp, and among such
oligonucleotides, an oligonucleotide having a nucleotide
sequence selected from DNAs encoding a protein relating to
a main metabolic pathway is particularly preferred. The
polynucleotide in which the full nucleotide sequence of the
chromosome DNA can be determined using the oligonucleotide
includes polynucleotides constituting a chromosome DNA
derived from a microorganism belonging to coryneform
bacteria. Such a polynucleotide is preferably a
polynucleotide constituting chromosome DNA derived from a
microorganism belonging to the genus Corynebacterium, more
preferably a polynucleotide constituting a chromosome DNA
of Corynebacterium glutamicum.






2. Identification of ORF (open reading frame) and
expression regulatory fragment and determination of the
function of ORF

Based on the full nucleotide sequence data of the
genome derived from coryneform bacteria determined in the
above item 1, an ORF and an expression modulating fragment
can be identified. Furthermore, the function of the thus
determined ORF can be determined.

The ORF means a continuous region in the nucleotide
sequence of mRNA which can be translated as an amino acid
sequence to mature to a protein. A region of the DNA
coding for the ORF of mRNA is also called ORF.

The expression modulating fragment (hereinafter
referred to as "EMF") is used herein to define a series of
polynucleotide fragments which modulate the expression of
the ORF or another sequence ligated operatably thereto.
The expression "modulate the expression of a sequence
ligated operatably" is used herein to refer to changes in
the expression of a sequence due to the presence of the EMF.
Examples of the EMF include a promoter, an operator, an
enhancer, a silencer, a ribosome-binding sequence, a
transcriptional termination sequence, and the like. In
coryneform bacteria, an EMF is usually present in an
intergenic segment (a fragment positioned between two
genes; about 10 to 200 nucleotides in length). Accordingly,
an EMF is frequently present in an intergenic segment of 10
nucleotides or longer. It is also possible to determine or
discover the presence of an EMF by using known EMF
sequences as a target sequence or a target structural motif
(or a target motif) using appropriate software or a
comparator, such as FASTA (Proc. Natl. Acad. Sci. USA,
85: 2444-48 (1988)), BLAST (J. Mol. Biol., 215: 403-410
(1990)) or the like. Also, it can be identified and
evaluated using a known EMF-capturing vector (for example,
pKK232-8; manufactured by Amersham Pharmacia Biotech).
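
Candidate intergenic segments of 10 nucleotides or longer can be listed from gene coordinates as in the following Python sketch. It is a minimal illustration under assumptions: the function name `intergenic_segments`, the coordinate convention (1-based, inclusive) and the example gene positions are hypothetical and are not taken from the sequence listing.

```python
def intergenic_segments(genes, min_length=10):
    """Return (start, end) gaps of at least min_length nucleotides between genes.

    genes: list of (start, end) positions, 1-based and inclusive.
    """
    segments = []
    previous_end = None
    for start, end in sorted(genes):
        if previous_end is not None and start - previous_end - 1 >= min_length:
            segments.append((previous_end + 1, start - 1))
        # Track the furthest gene end seen so far so that overlapping genes are handled
        previous_end = end if previous_end is None else max(previous_end, end)
    return segments


# Hypothetical gene coordinates on one strand-independent map
genes = [(100, 400), (405, 900), (1050, 1500)]
print(intergenic_segments(genes))
```

In this hypothetical input, the 4-nucleotide gap between the first two genes is discarded, and only the gap between positions 901 and 1049 is reported as a segment in which an EMF might be searched for.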

The term "target sequence" is used herein to refer
to a nucleotide sequence composed of 6 or more nucleotides,
an amino acid sequence composed of 2 or more amino acids,
or a nucleotide sequence encoding this amino acid sequence
composed of 2 or more amino acids. The longer a target
sequence is, the lower the possibility that it appears at
random in a data base. The target sequence is preferably
about 10 to 100 amino acid residues or about 30 to 300
nucleotide residues.

The term "target structural motif" or "target
motif" is used herein to refer to a sequence or a
combination of sequences selected optionally and reasonably.
Such a motif is selected on the basis of the
three-dimensional structure formed by the folding of a
polypeptide by means known to one of ordinary skill in the
art. Various motifs are known.



Examples of the target motif of a polypeptide 
include, but are not limited to, an enzyme activity site, a 
protein-protein interaction site, a signal sequence, and 
the like. Examples of the target motif of a nucleic acid
include a promoter sequence, a transcriptional regulatory
factor binding sequence, a hairpin structure, and the like.

Examples of highly useful EMF include a
high-expression promoter, an inducible-expression promoter, and
the like. Such an EMF can be obtained by positionally
determining the nucleotide sequence of a gene which is
known or expected as achieving high expression (for example,
ribosomal RNA gene: GenBank Accession No. M16175 or Z46753)
or a gene showing a desired induction pattern (for example,
isocitrate lyase gene induced by acetic acid: Japanese
Published Unexamined Patent Application No. 56782/93) via
the alignment with the full genome nucleotide sequence
determined in the above item 1, and isolating the genome
fragment in the upstream part (usually 200 to 500
nucleotides from the translation initiation site). It is
also possible to obtain a highly useful EMF by selecting an
EMF showing a high expression efficiency or a desired
induction pattern from among promoters captured by the
EMF-capturing vector as described above.
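
Isolation of the upstream part in silico, once a gene has been positioned on the genome sequence, can be sketched in Python as follows. The function name `upstream_region`, the 0-based coordinate convention and the short example sequences are assumptions made only for illustration; the default of 200 nucleotides is the lower end of the usual range given above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def upstream_region(genome, start_codon_index, strand="+", length=200):
    """Return up to `length` nucleotides upstream of a translation initiation site.

    start_codon_index: 0-based index, on the forward strand, of the base that is
    (strand "+") or pairs with (strand "-") the first base of the start codon.
    The result is written 5' to 3' on the coding strand of the gene.
    """
    if strand == "+":
        begin = max(start_codon_index - length, 0)
        return genome[begin:start_codon_index]
    # Reverse strand: the upstream part lies to the right on the forward strand
    segment = genome[start_codon_index + 1:start_codon_index + 1 + length]
    return segment.translate(COMPLEMENT)[::-1]


# Hypothetical forward-strand gene whose ATG begins at index 8
print(upstream_region("TTTTGGGGATGAAA", 8, "+", length=4))
# Hypothetical reverse-strand gene whose ATG pairs with "CAT" at indices 3 to 5
print(upstream_region("AAACATCCGT", 5, "-", length=4))
```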

The ORF can be identified by extracting
characteristics common to individual ORFs, constructing a
general model based on these characteristics, and measuring
the conformity of the subject sequence with the model. In
the identification, software such as GeneMark (Nuc.
Acids. Res., 22: 4756-67 (1994); manufactured by GenePro),
GeneMark.hmm (manufactured by GenePro), GeneHacker (Protein,
Nucleic Acid and Enzyme, 42: 3001-07 (1997)), Glimmer (Nuc.
Acids. Res., 26: 544-548 (1998); manufactured by The
Institute for Genomic Research), or the like, can be used.
In using the software, the default (initial setting)
parameters are usually used, though the parameters can be
optionally changed.
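
For orientation only, the simplest form of ORF candidate detection can be sketched in Python as follows. This sketch is not the model-based approach of GeneMark or Glimmer described above: it merely scans the forward strand for an ATG followed by an in-frame stop codon, ignores alternative start codons and the reverse strand, and uses a hypothetical function name `find_orfs` and a hypothetical example sequence.

```python
START_CODONS = {"ATG"}  # alternative start codons are ignored for simplicity
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end) 0-based half-open spans of forward-strand ORF candidates.

    Each span runs from the first start codon in a frame to the next in-frame
    stop codon, inclusive of the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if orf_start is None and codon in START_CODONS:
                orf_start = i
            elif orf_start is not None and codon in STOP_CODONS:
                if (i + 3 - orf_start) // 3 >= min_codons:
                    orfs.append((orf_start, i + 3))
                orf_start = None
    return sorted(orfs)


# Hypothetical 16-nucleotide sequence containing one short ORF candidate
sequence = "CCATGAAATTTTAGGG"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```

Model-based software additionally scores how well each candidate conforms to the characteristics common to known ORFs, which this sketch does not attempt.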

In the above-described comparisons, a computer,
such as UNIX, PC, Macintosh, or the like, can be used.

Examples of the ORF determined by the method of the
present invention include ORFs having the nucleotide
sequences represented by SEQ ID NOS:2 to 3501 present in
the genome of Corynebacterium glutamicum as represented by
SEQ ID NO:1. In these ORFs, polypeptides having the amino
acid sequences represented by SEQ ID NOS:3502 to 7001 are
encoded.

The function of an ORF can be determined by
comparing the identified amino acid sequence of the ORF
with known homologous sequences using a homology searching
software or comparator, such as BLAST, FASTA, Smith &
Waterman (Meth. Enzym., 164: 765 (1988)) or the like on an
amino acid data base, such as Swiss-Prot, PIR,
GenBank-nr-aa, GenPept constituted by protein-encoding
domains derived from GenBank data base, OWL or the like.

Furthermore, by the homology searching, the 
identity and similarity with the amino acid sequences of 
known proteins can also be analyzed. 

With respect to the term "identity" used herein,
where two polypeptides each having 10 amino acids
differ at the positions of 3 amino acids, these
polypeptides have an identity of 70% with each other.
Where one of the 3 differing amino acid pairs is analogous
(for example, leucine and isoleucine), these polypeptides
have a similarity of 80%.
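
The worked example above can be reproduced with the following Python sketch. The grouping of analogous amino acids in `SIMILAR_GROUPS`, the function name `identity_and_similarity` and the two example sequences are assumptions introduced for illustration; homology searching software uses substitution matrices rather than fixed groups.

```python
# Hypothetical groups of amino acids treated as analogous to one another
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("NQ"), set("ST")]


def identity_and_similarity(seq_a, seq_b):
    """Return (identity %, similarity %) for two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(a == b for a, b in zip(seq_a, seq_b))
    analogous = sum(
        a != b and any(a in group and b in group for group in SIMILAR_GROUPS)
        for a, b in zip(seq_a, seq_b)
    )
    length = len(seq_a)
    return 100 * identical / length, 100 * (identical + analogous) / length


# Two hypothetical 10-residue polypeptides differing at 3 positions,
# one of which is a leucine/isoleucine pair
print(identity_and_similarity("MKTLAGDPEW", "MKTIAGGPEA"))
```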

As a specific example, Table 1 shows the
registration numbers in known data bases of sequences which
are judged as having the highest similarity with the
nucleotide sequence of the ORF derived from Corynebacterium
glutamicum ATCC 13032, genes of these sequences, functions
of these genes, and identities thereof compared with known
amino acid translation sequences.

Thus, a great number of novel genes derived from
coryneform bacteria can be identified by determining the
full nucleotide sequence of the genome derived from
coryneform bacterium by the means of the present invention.
Moreover, the function of the proteins encoded by these
genes can be determined. Since coryneform bacteria are
industrially highly useful microorganisms, many of the
identified genes are industrially useful.

Moreover, the characteristics of respective
microorganisms can be clarified by classifying the
functions thus determined. As a result, valuable
information in breeding is obtained.

Furthermore, from the ORF information derived from
coryneform bacteria, the ORF corresponding to the
microorganism is prepared and obtained according to the
general method as disclosed in Molecular Cloning, 2nd ed.
or the like. Specifically, an oligonucleotide having a
nucleotide sequence adjacent to the ORF is synthesized, and
the ORF can be isolated and obtained using the
oligonucleotide as a primer and a chromosome DNA derived
from coryneform bacteria as a template according to the
general PCR cloning technique. Thus obtained ORF sequences
include polynucleotides comprising the nucleotide sequence
represented by any one of SEQ ID NOS:2 to 3501.

The ORF or primer can be prepared using a
polynucleotide synthesizer based on the above sequence
information.

Examples of the polynucleotide of the present 
invention include a polynucleotide containing the 
nucleotide sequence of the ORF obtained in the above, and a 
polynucleotide which hybridizes with the polynucleotide 
under stringent conditions.

The polynucleotide of the present invention can be
a single-stranded DNA, a double-stranded DNA or a
single-stranded RNA, though it is not limited thereto.

The polynucleotide which hybridizes with the
polynucleotide containing the nucleotide sequence of the
ORF obtained in the above under stringent conditions
includes a degenerate mutant of the ORF. A degenerate
mutant is a polynucleotide fragment whose nucleotide
sequence differs from the sequence of the ORF of
the present invention but which encodes the same amino acid
sequence owing to the degeneracy of the genetic code.
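
The notion of a degenerate mutant can be checked with a short Python sketch such as the following. The two 12-nucleotide sequences and the function names are hypothetical, and the standard genetic code is assumed; the sketch only tests whether two different coding sequences translate to the same amino acid sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(orf, other):
    """True when the nucleotide sequences differ but the proteins do not."""
    return orf != other and translate(orf) == translate(other)


# Hypothetical sequences: both encode Met-Ala-Lys-Glu.
orf = "ATGGCTAAAGAA"
variant = "ATGGCCAAGGAG"
print(translate(orf), translate(variant),
      is_degenerate_variant(orf, variant))
```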

Specific examples include a polynucleotide 
comprising the nucleotide sequence represented by any one 
of SEQ ID NOS:2 to 3431, and a polynucleotide which 
hybridizes with the polynucleotide under stringent 
conditions. 

A polynucleotide which hybridizes under stringent 
conditions is a polynucleotide obtained by colony 
hybridization, plaque hybridization, Southern blot 
hybridization or the like using, as a probe, the 
polynucleotide having the nucleotide sequence of the ORF 
identified in the above. Specific examples include a 
polynucleotide which can be identified by carrying out 
hybridization at 65°C in the presence of 0.7-1.0 M NaCl 
using a filter on which a polynucleotide prepared from 
colonies or plaques is immobilized, and then washing the
filter with 0.1x to 2x SSC solution (the composition of 1x
SSC contains 150 mM sodium chloride and 15 mM sodium
citrate) at 65°C.
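
As a worked illustration of the washing conditions, the following Python sketch scales the 1x SSC composition stated above to an arbitrary strength. The function names are hypothetical; the only figures used are the 150 mM sodium chloride, 15 mM sodium citrate and 0.1x to 2x range given in the text.

```python
# Composition of 1x SSC as stated in the text (millimolar).
SSC_1X_MM = {"sodium chloride": 150.0, "sodium citrate": 15.0}


def ssc_composition(strength):
    """Millimolar concentrations of an SSC solution of the given strength."""
    return {solute: mm * strength for solute, mm in SSC_1X_MM.items()}


def is_in_wash_range(strength):
    """True when the strength lies within the 0.1x to 2x range of the text."""
    return 0.1 <= strength <= 2.0


for strength in (0.1, 1.0, 2.0):
    comp = ssc_composition(strength)
    print(f"{strength}x SSC: "
          f"{comp['sodium chloride']:.1f} mM NaCl, "
          f"{comp['sodium citrate']:.1f} mM sodium citrate")
```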

The hybridization can be carried out in accordance
with known methods described in, for example, Molecular
Cloning, 2nd ed., Current Protocols in Molecular Biology,
DNA Cloning 1: Core Techniques, A Practical Approach,
Second Edition, Oxford University (1995) or the like.
Specific examples of the polynucleotide which can be
hybridized include a DNA having a homology of 60% or more,
preferably 80% or more, and particularly preferably 95% or
more, with the nucleotide sequence represented by any one
of SEQ ID NOS:2 to 3431 when calculated using default
(initial setting) parameters of a homology searching
software, such as BLAST, FASTA, Smith-Waterman or the like.

Also, the polynucleotide of the present invention
includes a polynucleotide encoding a polypeptide comprising
the amino acid sequence represented by any one of SEQ ID
NOS:3502 to 6931 and a polynucleotide which hybridizes with
the polynucleotide under stringent conditions.

Furthermore, the polynucleotide of the present
invention includes a polynucleotide which is present in the
5' upstream or 3' downstream region of a polynucleotide
comprising the nucleotide sequence of any one of SEQ ID
NOS:2 to 3431 in a polynucleotide comprising the nucleotide
sequence represented by SEQ ID NO:1, and has an activity of
regulating an expression of a polypeptide encoded by the
polynucleotide. Specific examples of the polynucleotide
having an activity of regulating an expression of a
polypeptide encoded by the polynucleotide include a
polynucleotide encoding the above-described EMF, such as a
promoter, an operator, an enhancer, a silencer, a
ribosome-binding sequence, a transcriptional termination
sequence, and the like.

The primer used for obtaining the ORF according to
the above PCR cloning technique includes an oligonucleotide
comprising a sequence which is the same as a sequence of 10
to 200 continuous nucleotides in the nucleotide sequence of
the ORF and an adjacent region or an oligonucleotide
comprising a sequence which is complementary to the
oligonucleotide. Specific examples include an
oligonucleotide comprising a sequence which is the same as
a sequence of 10 to 200 continuous nucleotides of the
nucleotide sequence represented by any one of SEQ ID NOS:1
to 3431, and an oligonucleotide comprising a sequence
complementary to the oligonucleotide comprising a sequence
of at least 10 to 20 continuous nucleotides of any one of
SEQ ID NOS:1 to 3431. When the primers are used as a sense
primer and an antisense primer, the above-described
oligonucleotides in which melting temperature (Tm) and the
number of nucleotides are not significantly different from
each other are preferred.
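
The preference for a sense primer and an antisense primer of similar Tm and length can be screened with a sketch such as the following. It assumes the Wallace rule, Tm = 2 x (A+T) + 4 x (G+C) in °C, which is a rough approximation for short oligonucleotides and is not a method named in the text. The primer sequences, the function names and the thresholds of 5°C and 3 nucleotides are likewise hypothetical; only the 10 to 200 nucleotide range is taken from the text.

```python
def wallace_tm(oligo):
    """Approximate Tm in degrees C by the Wallace rule (an assumption)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def primers_are_matched(sense, antisense, max_tm_diff=5, max_len_diff=3):
    """Check the length range and that Tm and length are close.

    The default thresholds are illustrative, not values from the text.
    """
    for primer in (sense, antisense):
        if not 10 <= len(primer) <= 200:
            return False
    tm_diff = abs(wallace_tm(sense) - wallace_tm(antisense))
    len_diff = abs(len(sense) - len(antisense))
    return tm_diff <= max_tm_diff and len_diff <= max_len_diff


# Hypothetical primer pair, 18 nucleotides each.
sense_primer = "ATGGCTAAAGAACTGGTT"
antisense_primer = "TTAGCGACCTTCAGCAAC"
print(wallace_tm(sense_primer), wallace_tm(antisense_primer),
      primers_are_matched(sense_primer, antisense_primer))
```

More accurate Tm estimates, for example nearest-neighbor models, would normally be preferred for primers much longer than about 20 nucleotides.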



The oligonucleotide of the present invention 
includes an oligonucleotide comprising a sequence which is 
the same as 10 to 200 continuous nucleotides of the 
nucleotide sequence represented by any one of SEQ ID NOS:1
to 3431 or an oligonucleotide comprising a sequence 
complementary to the oligonucleotide. 

Also, analogues of these oligonucleotides 
(hereinafter also referred to as "analogous 
oligonucleotides") are also provided by the present 
invention and are useful in the methods described herein. 

Examples of the analogous oligonucleotides include
analogous oligonucleotides in which a phosphodiester bond
in an oligonucleotide is converted to a phosphorothioate
bond, analogous oligonucleotides in which a phosphodiester
bond in an oligonucleotide is converted to an N3'-P5'
phosphoramidate bond, analogous oligonucleotides in which
ribose and a phosphodiester bond in an oligonucleotide are
converted to a peptide nucleic acid bond, analogous
oligonucleotides in which uracil in an oligonucleotide is
replaced with C-5 propynyluracil, analogous
oligonucleotides in which uracil in an oligonucleotide is
replaced with C-5 thiazoluracil, analogous oligonucleotides
in which cytosine in an oligonucleotide is replaced with
C-5 propynylcytosine, analogous oligonucleotides in which
cytosine in an oligonucleotide is replaced with
phenoxazine-modified cytosine, analogous oligonucleotides
in which ribose in an oligonucleotide is replaced with
2'-O-propylribose, analogous oligonucleotides in which
ribose in an oligonucleotide is replaced with
2'-methoxyethoxyribose, and the like (Cell Engineering,
16: 1463 (1997)).

The above oligonucleotides and analogous
oligonucleotides of the present invention can be used not
only as primers but also as probes for hybridization and
as the antisense nucleic acids described below.

Examples of a primer for the antisense nucleic acid
techniques known in the art include, in addition to the
above oligonucleotide, an oligonucleotide which hybridizes
with the oligonucleotide of the present invention under
stringent conditions and has an activity of regulating
expression of the polypeptide encoded by the polynucleotide.

3. Determination of isozymes 

Many mutants of coryneform bacteria which are
useful in the production of useful substances, such as 
amino acids, nucleic acids, vitamins, saccharides, organic 
acids, and the like, are obtained by the present invention. 

However, since the gene sequence data of the 
microorganism has been, to date, insufficient, useful 
mutants have been obtained by mutagenic techniques using a 
mutagen, such as nitrosoguanidine (NTG) or the like.

Although genes can be mutated randomly by the
mutagenic method using the above-described mutagen, it is
not possible to mutate all of the genes encoding the
respective isozymes having similar properties relating to
the metabolism of intermediates. Moreover, because genes
are mutated randomly in the mutagenic method using a
mutagen, harmful mutations worsening culture
characteristics, such as delay in growth, accelerated
foaming, and the like, might be imparted at a great
frequency.

However, if gene sequence information is available, 
such as is provided by the present invention, it is 
possible to mutate all of the genes encoding target 
isozymes. In this case, harmful mutations may be avoided 
and the target mutation can be incorporated. 

Namely, an accurate number and sequence information 
of the target isozymes in coryneform bacteria can be 
obtained based on the ORF data obtained in the above item 2. 
By using the sequence information, all of the target 
isozyme genes can be mutated into genes having the desired 
properties by, for example, the site-specific mutagenesis 
method described in Molecular Cloning, 2nd ed. to obtain 
useful mutants having elevated productivity of useful 
substances.

4. Clarification or determination of biosynthesis pathway
and signal transmission pathway

Attempts have been made to elucidate biosynthesis 
pathways and signal transmission pathways in a number of 
organisms, and many findings have been reported. However, 
there are many unknown aspects of coryneform bacteria since 
a number of genes have not been identified so far. 

These unknown points can be clarified by the 
following method. 

The functional information of the ORFs derived from
coryneform bacteria as identified by the method of the above
item 2 is arranged. The term "arranged" means that each ORF
is classified based on the biosynthesis pathway of a
substance or the signal transmission pathway to which the
ORF belongs using known information according to the
functional information. Next, the arranged ORF sequence
information is compared with enzymes on the biosynthesis
pathways or signal transmission pathways of other known
organisms. The resulting information is combined with
known data on coryneform bacteria. Thus, the biosynthesis
pathways and signal transmission pathways in coryneform
bacteria, which have been unknown so far, can be determined.

Once these pathways, which have hitherto been
unknown or unclear, are clarified, a useful mutant
for producing a target useful substance can be efficiently
obtained.

When the thus clarified pathway is judged as
important in the synthesis of a useful product, a useful
mutant can be obtained by selecting a mutant wherein this
pathway has been strengthened. Also, when the thus
clarified pathway is judged as not important in the
biosynthesis of the target useful product, a useful mutant
can be obtained by selecting a mutant wherein the
utilization frequency of this pathway is lowered.

5. Clarification or determination of useful mutation point 

Many useful mutants of coryneform bacteria which 
are suitable for the production of useful substances, such 
as amino acids, nucleic acids, vitamins, saccharides, 
organic acids, and the like, have been obtained. However,
it is hardly known which mutation points in which genes
are responsible for the improved productivity.

However, mutation points contained in production
strains can be identified by comparing desired sequences of
the genome DNA of the production strains obtained from
coryneform bacteria by the mutagenic technique with the
nucleotide sequences of the corresponding genome DNA and
ORF derived from coryneform bacteria determined by the
methods of the above items 1 and 2 and analyzing them.
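
The comparison described above can be sketched in Python as follows. The sketch assumes that the wild-type and production-strain ORFs are already aligned, have the same length and contain no insertions or deletions, and it uses the standard genetic code. The function names and the 60-codon test sequences are hypothetical placeholders, not sequences from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def find_mutation_points(wild_orf, mutant_orf):
    """List (changed base positions, amino acid change) per altered codon.

    Positions are 1-based. A codon change that leaves the amino acid
    unchanged is reported as "silent".
    """
    if len(wild_orf) != len(mutant_orf) or len(wild_orf) % 3:
        raise ValueError("ORFs must be aligned, equal-length, in frame")
    points = []
    for i in range(0, len(wild_orf), 3):
        wild, mutant = wild_orf[i:i + 3], mutant_orf[i:i + 3]
        if wild == mutant:
            continue
        bases = [i + k + 1 for k in range(3) if wild[k] != mutant[k]]
        aa_wild, aa_mutant = CODON_TABLE[wild], CODON_TABLE[mutant]
        codon_no = i // 3 + 1
        if aa_wild == aa_mutant:
            change = "silent"
        else:
            change = f"{THREE_LETTER[aa_wild]}{codon_no}{THREE_LETTER[aa_mutant]}"
        points.append((bases, change))
    return points


# Hypothetical 60-codon ORF pair differing only in codon 59 (GTT -> GCT).
wild_type = "ATG" + "GCA" * 57 + "GTT" + "TAA"
production = "ATG" + "GCA" * 57 + "GCT" + "TAA"
print(find_mutation_points(wild_type, production))
```

For real genome data the two sequences would first be aligned, since insertions or deletions shift every downstream position.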

Moreover, effective mutation points contributing to
the production can be easily specified from among these
mutation points on the basis of known information relating
to the metabolic pathways, the metabolic regulatory
mechanisms, the structure-activity correlation of enzymes,
and the like.

When an effective mutation can hardly be specified
based on known data, the mutation points thus identified
can be introduced into a wild strain of coryneform bacteria
or a production strain free of the mutation. Then, it is
examined whether or not any positive effect can be achieved
on the production.

For example, by comparing the nucleotide sequence
of homoserine dehydrogenase gene hom of a lysine-producing
B-6 strain of Corynebacterium glutamicum (Appl. Microbiol.
Biotechnol., 32: 269-273 (1989)) with the nucleotide
sequence corresponding to the genome of Corynebacterium
glutamicum ATCC 13032 according to the present invention, a
mutation of amino acid replacement in which valine at the
59-position is replaced with alanine (Val59Ala) was
identified. A strain obtained by introducing this mutation
into the ATCC 13032 strain by the gene replacement method
can produce lysine, which indicates that this mutation is
an effective mutation contributing to the production of
lysine.

Similarly, by comparing the nucleotide sequence of
pyruvate carboxylase gene pyc of the B-6 strain with the
nucleotide sequence corresponding to the ATCC 13032 genome,
a mutation of amino acid replacement in which proline at
the 458-position was replaced with serine (Pro458Ser) was
identified. A strain obtained by introducing this mutation
into a lysine-producing strain of No. 58 (FERM BP-7134) of
Corynebacterium glutamicum free of this mutation shows an
improved lysine productivity in comparison with the No. 58
strain, which indicates that this mutation is an effective
mutation contributing to the production of lysine.

In addition, a mutation Ala213Thr in glucose-6-phosphate
dehydrogenase was specified as an effective
mutation relating to the production of lysine by detecting
glucose-6-phosphate dehydrogenase gene zwf of the B-6
strain.

Furthermore, the lysine productivity of
Corynebacterium glutamicum was improved by replacing the
base at the 932-position of aspartokinase gene lysC of the
Corynebacterium glutamicum ATCC 13032 genome with cytosine
to thereby replace threonine at the 311-position by
isoleucine, which indicates that this mutation is an
effective mutation contributing to the production of lysine.

Also, as another method to examine whether or not
the identified mutation point is an effective mutation,
there is a method in which the mutation possessed by the
lysine-producing strain is returned to the sequence of a
wild type strain by the gene replacement method and it is
examined whether or not this has a negative influence on
the lysine productivity. For example, when the amino acid
replacement mutation Val59Ala possessed by hom of the
lysine-producing B-6 strain was returned to a wild type
amino acid sequence, the lysine productivity was lowered in
comparison with the B-6 strain. Thus, it was found that
this mutation is an effective mutation contributing to the
production of lysine.

Effective mutation points can be more efficiently
and comprehensively extracted by combining, if needed, the
DNA array analysis or proteome analysis described below.

6. Method of breeding industrially advantageous production
strain 

It has been a general practice to construct 
production strains, which are used industrially in the 
fermentation production of the target useful substances, 
such as amino acids, nucleic acids, vitamins, saccharides, 
organic acids, and the like, by repeating mutagenesis and 
breeding based on random mutagenesis using mutagens, such 
as NTG or the like, and screening. 

In recent years, many examples of improved 
production strains have been made through the use of 
recombinant DNA techniques. In breeding, however, most of 
the parent production strains to be improved are mutants 
obtained by a conventional mutagenic procedure (W.
Leuchtenberger, Amino Acids - Technical Production and Use.
In: Roehr (ed) Biotechnology, second edition, vol. 6,
products of primary metabolism, VCH Verlagsgesellschaft mbH,
Weinheim, p. 465 (1996)).

Although mutagenesis methods have largely 
contributed to the progress of the fermentation industry, 
they suffer from a serious problem of multiple, random 
introduction of mutations into every part of the chromosome. 
Since many mutations are accumulated in a single chromosome
each time a strain is improved, a production strain
obtained by random mutation and selection is generally
inferior in properties (for example, showing poor growth,
delayed consumption of saccharides, and poor resistance to
stresses such as temperature and oxygen) to a wild type
strain. This brings about troubles such as failing to
establish a sufficiently elevated productivity, being
frequently contaminated with miscellaneous bacteria,
requiring troublesome procedures in culture maintenance,
and the like, and, in turn, elevates the production
cost in practice. In addition, the improvement in the
productivity is based on random mutations and thus the 
mechanism thereof is unclear. Therefore, it is very 
difficult to plan a rational breeding strategy for the 
subsequent improvement in the productivity. 

According to the present invention, effective 
mutation points contributing to the production can be 
efficiently specified from among many mutation points 
accumulated in the chromosome of a production strain which
has been bred from coryneform bacteria and, therefore, a
novel breeding method of assembling these effective 
mutations in the coryneform bacteria can be established. 
Thus, a useful production strain can be reconstructed. It 
is also possible to construct a useful production strain 
from a wild type strain. 

Specifically, a useful mutant can be constructed in 
the following manner. 

One of the mutation points is incorporated into a 
wild type strain of coryneform bacteria. Then, it is 
examined whether or not a positive effect is established on 
the production. When a positive effect is obtained, the 
mutation point is saved. When no effect is obtained, the 
mutation point is removed. Subsequently, only a strain 
having the effective mutation point is used as the parent 
strain, and the same procedure is repeated. In general,
the effectiveness of a mutation positioned upstream cannot
be clearly evaluated in some cases when there is a
rate-determining point downstream in a biosynthesis
pathway. It is therefore preferred to evaluate mutation
points successively from downstream to upstream.
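
The stepwise procedure described above can be summarized by the following Python sketch. It is a schematic outline only: the function measure_productivity stands in for an actual fermentation test, the mutation names mutA to mutD and their numerical effects are invented placeholders, and the candidate list is assumed to be already ordered from downstream to upstream.

```python
def assemble_effective_mutations(wild_type, candidates, measure_productivity):
    """Introduce candidates one at a time and keep those that help.

    wild_type: set of mutations already present in the starting strain.
    candidates: mutation points, assumed ordered downstream to upstream.
    measure_productivity: callable returning a number for a genotype
        (a stand-in for a real production test).
    """
    parent = frozenset(wild_type)
    best = measure_productivity(parent)
    kept = []
    for mutation in candidates:
        trial = parent | {mutation}
        score = measure_productivity(trial)
        if score > best:
            # Positive effect: save the mutation and use the new
            # strain as the parent for the next round.
            parent, best = trial, score
            kept.append(mutation)
        # Otherwise the mutation is discarded and the parent is unchanged.
    return kept, best


# Invented toy model: each mutation adds a fixed amount to a baseline.
TOY_EFFECT = {"mutA": 5, "mutB": 0, "mutC": 3, "mutD": -2}


def toy_productivity(genotype):
    return 10 + sum(TOY_EFFECT[m] for m in genotype)


kept, final = assemble_effective_mutations(
    set(), ["mutA", "mutB", "mutC", "mutD"], toy_productivity)
print(kept, final)
```

An additive toy model is used only to keep the example short; in practice the effect of a mutation depends on the mutations already present, which is why each candidate is tested in the current parent strain.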

By reconstituting effective mutations by the method
as described above in a wild type strain or a strain which
has a high growth speed or the same ability to consume
saccharides as the wild type strain, it is possible to
construct an industrially advantageous strain which is free
of troubles in the previous methods as described above and
to conduct fermentation production using such strains
within a short time or at a higher temperature.

For example, a lysine-producing mutant B-6 (Appl.
Microbiol. Biotechnol., 32: 262-273 (1989)), which is
obtained by multiple rounds of random mutagenesis from a
wild type strain Corynebacterium glutamicum ATCC 13032,
enables lysine fermentation to be performed at a
temperature between 30 and 34°C but shows lowered growth
and lysine productivity at a temperature exceeding 34°C.
Therefore, the fermentation temperature should be
maintained at 34°C or lower. In contrast thereto, the
production strain described in the above item 5, which is
obtained by reconstituting effective mutations relating to
lysine production, can achieve a productivity at 40 to 42°C
equal or superior to the result obtained by culturing at 30
to 34°C. Therefore, this strain is industrially
advantageous since it can save the load of cooling during
the fermentation.

When culture should be carried out at a high
temperature exceeding 43°C, a production strain capable of
conducting fermentation production at a high temperature
exceeding 43°C can be obtained by reconstituting useful
mutations in a microorganism belonging to the genus
Corynebacterium which can grow at high temperature
exceeding 43°C. Examples of the microorganism capable of
growing at a high temperature exceeding 43°C include
Corynebacterium thermoaminogenes, such as Corynebacterium
thermoaminogenes FERM 9244, FERM 9245, FERM 9246 and
FERM 9247.

A strain having a further improved productivity of
the target product can be obtained using the thus
reconstructed strain as the parent strain and further
breeding it using the conventional mutagenesis method, the
gene amplification method, the gene replacement method
using the recombinant DNA technique, the transduction
method or the cell fusion method. Accordingly, the
microorganism of the present invention includes, but is not
limited to, a mutant, a cell fusion strain, a transformant,
a transductant or a recombinant strain constructed by using
recombinant DNA techniques, so long as it is a producing
strain obtained via the step of accumulating at least two
effective mutations in a coryneform bacterium in the course
of breeding.

When a mutation point judged as being harmful to 
the growth or production is specified, on the other hand, 
it is examined whether or not the producing strain used at 
present contains the mutation point. When it has the 
mutation, it can be returned to the wild type gene and thus 
a further useful production strain can be bred. 

The breeding method as described above is
applicable to microorganisms, other than coryneform
bacteria, which have industrially advantageous properties
(for example, microorganisms capable of quickly utilizing
less expensive carbon sources, microorganisms capable of
growing at higher temperatures).

7. Production and utilization of polynucleotide array 
(1) Production of polynucleotide array 

A polynucleotide array can be produced using the 
polynucleotide or oligonucleotide of the present invention 
obtained in the above items 1 and 2. 

Examples include a polynucleotide array comprising 
a solid support to which at least one of a polynucleotide 
comprising the nucleotide sequence represented by SEQ ID 
NOS:2 to 3501, a polynucleotide which hybridizes with the 
polynucleotide under stringent conditions, and a 
polynucleotide comprising 10 to 200 continuous nucleotides 
in the nucleotide sequence of the polynucleotide is 
adhered; and a polynucleotide array comprising a solid 
support to which at least one of a polynucleotide encoding 
a polypeptide comprising the amino acid sequence 
represented by any one of SEQ ID NOS:3502 to 7001, a 
polynucleotide which hybridizes with the polynucleotide 
under stringent conditions, and a polynucleotide comprising 
10 to 200 continuous bases in the nucleotide sequences of 
the polynucleotides is adhered.

Polynucleotide arrays of the present invention
include substrates known in the art, such as a DNA chip, a
DNA microarray and a DNA macroarray, and the like, and
comprise a solid support and plural polynucleotides or
fragments thereof which are adhered to the surface of the
solid support.

Examples of the solid support include a glass plate, 
a nylon membrane, and the like. 

The polynucleotides or fragments thereof can be
adhered to the surface of the solid support using the
general technique for preparing arrays, namely, a method
in which they are adhered to a chemically surface-treated
solid support, for example, one to which a polycation such
as polylysine or the like has been adhered (Nat. Genet.,
21: 15-19 (1999)). Chemically surface-treated supports are
commercially available, and such a commercially available
product can be used as the solid support of the
polynucleotide array according to the present invention.

As the polynucleotides or oligonucleotides adhered 
to the solid support, the polynucleotides and 
oligonucleotides of the present invention obtained in the 
above items 1 and 2 can be used. 

The analysis described below can be efficiently
performed by adhering the polynucleotides or
oligonucleotides to the solid support at a high density,
though a high fixation density is not always necessary.

An apparatus for achieving a high fixation density,
such as an arrayer robot or the like, is commercially
available from Takara Shuzo (GMS417 Arrayer), and the
commercially available product can be used.

Also, the oligonucleotides of the present invention
can be synthesized directly on the solid support by the
photolithography method or the like (Nat. Genet., 21: 20-24
(1999)). In this method, a linker having a protective
group which can be removed by light irradiation is first
adhered to a solid support, such as a slide glass or the
like. Then, it is irradiated with light through a mask (a
photolithograph mask) permeating light exclusively at a
definite part of the adhesion part. Next, an
oligonucleotide having a protective group which can be
removed by light irradiation is added to the part. Thus, a
ligation reaction with the nucleotide arises exclusively at
the irradiated part. By repeating this procedure,
oligonucleotides, each having a desired sequence, different
from each other can be synthesized in respective parts.
Usually, the oligonucleotides to be synthesized have a
length of 10 to 30 nucleotides.

(2) Use of polynucleotide array

The following procedures (a) and (b) can be carried
out using the polynucleotide array prepared in the above
(1).

(a) Identification of mutation point of coryneform 
bacterium mutant and analysis of expression amount and 
expression profile of gene encoded by genome 

By subjecting a gene derived from a mutant of
coryneform bacteria or an examined gene to the following
steps (i) to (iv), the mutation point of the gene can be
identified or the expression amount and expression profile
of the gene can be analyzed:

(i) producing a polynucleotide array by the method of
the above (1);

(ii) incubating the labeled gene derived from a mutant
of the coryneform bacterium together with the
polynucleotides immobilized on the polynucleotide array
produced in the above (i) under hybridization conditions;

(iii) detecting the hybridization; and

(iv) analyzing the hybridization data.

The gene derived from a mutant of coryneform
bacteria or the examined gene includes a gene relating to
biosynthesis of at least one selected from amino acids,
nucleic acids, vitamins, saccharides, organic acids, and
analogues thereof.

The method will be described in detail. 

A single nucleotide polymorphism (SNP) in a human
region of 2,300 kb has been identified using polynucleotide
arrays (Science, 280: 1077-82 (1998)). In accordance with
this method of identifying SNP and the methods described in
Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA,
96: 12833-38 (1999); Science, 284: 1520-23 (1999), and the
like, and by using the polynucleotide array produced in the
above (1) and a nucleic acid molecule (DNA, RNA) derived
from coryneform bacteria in the hybridization, a
mutation point of a useful mutant, which is useful in
producing an amino acid, a nucleic acid, a vitamin, a
saccharide, an organic acid, or the like can be identified
and the gene expression amount and the expression profile
thereof can be analyzed.

The nucleic acid molecule (DNA, RNA) derived from
the coryneform bacteria can be obtained according to the
general method described in Molecular Cloning, 2nd ed. or
the like. mRNA derived from Corynebacterium glutamicum can
also be obtained by the method of Bormann et al. (Molecular
Microbiology, 6: 317-326 (1992)) or the like.

Although ribosomal RNA (rRNA) is usually obtained 
in large excess in addition to the target mRNA, the 
analysis is not seriously disturbed thereby.

The resulting nucleic acid molecule derived from
coryneform bacteria is labeled. Labeling can be carried
out according to a method using a fluorescent dye, a method
using a radioisotope or the like.

Specific examples include a labeling method in
which psoralen-biotin is crosslinked with RNA extracted
from a microorganism and, after hybridization reaction, a
fluorescent dye having streptavidin bound thereto is bound
to the biotin moiety (Nat. Biotechnol., 16: 45-48 (1998));
a labeling method in which a reverse transcription reaction
is carried out using RNA extracted from a microorganism as
a template and random primers as primers, and dUTP having a
fluorescent dye (for example, Cy3, Cy5) (manufactured by
Amersham Pharmacia Biotech) is incorporated into cDNA (Proc.
Natl. Acad. Sci. USA, 96: 12833-38 (1999)); and the like.

The labeling specificity can be improved by
replacing the random primers by sequences complementary to
the 3'-end of ORF (J. Bacteriol., 181: 6425-40 (1999)).

In the hybridization method, the hybridization and
subsequent washing can be carried out by the general method
(Nat. Biotechnol., 14: 1675-80 (1996), or the like).

Subsequently, the hybridization intensity is 
measured depending on the hybridization amount of the 
nucleic acid molecule used in the labeling. Thus, the 
mutation point can be identified and the expression amount 
of the gene can be calculated.

The hybridization intensity can be measured by
visualizing the fluorescent signal, radioactivity,
luminescence dose, and the like, using a laser confocal
microscope, a CCD camera, a radiation imaging device (for
example, STORM manufactured by Amersham Pharmacia Biotech),
and the like, and then quantifying the thus visualized data.

A polynucleotide array on a solid support can also 
be analyzed and quantified using a commercially available 
apparatus, such as GMS418 Array Scanner (manufactured by 
Takara Shuzo) or the like. 

The gene expression amount can be analyzed using
commercially available software (for example, ImaGene
manufactured by Takara Shuzo; Array Gauge manufactured by
Fuji Photo Film; ImageQuant manufactured by Amersham
Pharmacia Biotech, or the like).

A fluctuation in the expression amount of a
specific gene can be monitored using a nucleic acid
molecule obtained in the time course of culture as the
nucleic acid molecule derived from coryneform bacteria.
The culture conditions can be optimized by analyzing the
fluctuation.
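
One simple way to express such a fluctuation is as the log2 ratio of the hybridization intensity at each sampling time to the intensity at the first sampling time. The following Python sketch illustrates this; the sampling times, the intensity values and the function names are invented for illustration and are not measured data, and the choice of a log2 ratio is an assumption rather than a method prescribed in the text.

```python
import math


def log2_fold_changes(intensities):
    """Log2 ratio of each intensity to the first time point."""
    baseline = intensities[0]
    if baseline <= 0 or any(v <= 0 for v in intensities):
        raise ValueError("intensities must be positive")
    return [math.log2(v / baseline) for v in intensities]


def time_of_peak(times, intensities):
    """Sampling time at which the intensity is highest."""
    peak_index = max(range(len(intensities)), key=intensities.__getitem__)
    return times[peak_index]


# Invented example: one gene sampled at four culture times (hours).
hours = [0, 4, 8, 12]
signal = [100.0, 200.0, 400.0, 50.0]
for hour, change in zip(hours, log2_fold_changes(signal)):
    print(f"{hour:>2} h: log2 ratio {change:+.1f}")
print("peak at", time_of_peak(hours, signal), "h")
```

In practice the intensities would first be background-corrected and normalized between arrays before ratios are taken.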

The expression profile of the microorganism at the 
total gene level (namely, which genes among a great number 
of genes encoded by the genome have been expressed and the 
expression ratio thereof) can be determined using a nucleic 
acid molecule having the sequences of many genes determined 
from the full genome sequence of the microorganism. Thus, 
the expression amount of the genes determined by the full 
genome sequence can be analyzed and, in turn, the 
biological conditions of the microorganism can be 
recognized as the expression pattern at the full gene level. 

(b) Confirmation of the presence of gene homologous to 
examined gene in coryneform bacteria 

Whether or not a gene homologous to the examined 
gene, which is present in an organism other than coryneform 
bacteria, is present in coryneform bacteria can be detected 
using the polynucleotide array prepared in the above (1). 

This detection can be carried out by a method in 
which an examined gene which is present in an organism 
other than coryneform bacteria is used instead of the 
nucleic acid molecule derived from coryneform bacteria used 
in the above identification/analysis method of (1). 

8. Recording medium storing full genome nucleotide sequence 
and ORF data and being readable by a computer, and methods 
for using the same 

The term "recording medium or storage device which 
is readable by a computer" means a recording medium or 
storage medium which can be directly read and accessed 
with a computer. Examples include magnetic recording media, 
such as a floppy disk, a hard disk, a magnetic tape, and 
the like; optical recording media, such as CD-ROM, CD-R, 
CD-RW, DVD-ROM, DVD-RAM, DVD-RW, and the like; electric 
recording media, such as RAM, ROM, and the like; and 
hybrids in these categories (for example, magnetic/optical 
recording media, such as MO and the like). 

Instruments for recording or inputting in or on the 
recording medium or instruments or devices for reading out 
the information in the recording medium can be 
appropriately selected, depending on the type of the 
recording medium and the access device utilized. Also, 
various data processing programs, software, comparators and 
formats are used for recording and utilizing the 
polynucleotide sequence information or the like, of the 
present invention in the recording medium. The information 
can be expressed in the form of a binary file, a text file 
or an ASCII file formatted with commercially available 
software, for example. Moreover, software for accessing 
the sequence information is available and known to one of 
ordinary skill in the art. 

Examples of the information to be recorded in the 
above-described medium include the full genome nucleotide 
sequence information of coryneform bacteria as obtained in 
the above item 2, the nucleotide sequence information of 
ORF, the amino acid sequence information encoded by the ORF, 
and the functional information of polynucleotides coding 
for the amino acid sequences. 

The recording medium or storage device which is 
readable by a computer according to the present invention 
refers to a medium in which the information of the present 
invention has been recorded. Examples include recording 
media or storage devices which are readable by a computer 
storing the nucleotide sequence information represented by 
SEQ ID NOS:1 to 3501, the amino acid sequence information 
represented by SEQ ID NOS:3502 to 7001, the functional 
information of the nucleotide sequences represented by SEQ 
ID NOS:1 to 3501, the functional information of the amino 
acid sequences represented by SEQ ID NOS:3502 to 7001, and 
the information listed in Table 1 below and the like. 

9. System based on a computer using the recording medium of 
the present invention which is readable by a computer 

The term "system based on a computer" as used 
herein refers to a system composed of hardware device(s), 
software device(s), and data recording device(s) which are 
used for analyzing the data recorded in the recording 
medium of the present invention which is readable by a 
computer. 

The hardware device(s) are, for example, composed 
of an input unit, a data recording unit, a central 
processing unit and an output unit collectively or 
individually. 

By the software device(s), the data recorded in the 
recording medium of the present invention are searched or 
analyzed using the recorded data and the hardware device(s) 
as described herein. Specifically, the software device(s) 
contain at least one program which acts on or with the 
system in order to screen, analyze or compare biologically 
meaningful structures or information from the nucleotide 
sequences, amino acid sequences and the like recorded in 
the recording medium according to the present invention. 

Examples of the software device(s) for identifying 
ORF and EMF domains include GeneMark (Nuc. Acids. Res., 
22: 4756-67 (1994)), GeneHacker (Protein, Nucleic Acid and 
Enzyme, 42: 3001-07 (1997)), Glimmer (The Institute of 
Genomic Research; Nuc. Acids. Res., 26: 544-548 (1998)) and 
the like. In the process of using such a software device, 
the default (initial setting) parameters are usually used, 
although the parameters can be changed, if necessary, in a 
manner known to one of ordinary skill in the art. 

Examples of the software device(s) for identifying 
a genome domain or a polypeptide domain analogous to the 
target sequence or the target structural motif (homology 
searching) include FASTA, BLAST, Smith-Waterman, GenetyxMac 
(manufactured by Software Development), GCG Package 
(manufactured by Genetic Computer Group), GenCore 
(manufactured by Compugen), and the like. In the process 
of using such a software device, the default (initial 
setting) parameters are usually used, although the 
parameters can be changed, if necessary, in a manner known 
to one of ordinary skill in the art. 
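
For illustration, a minimal sketch of the Smith-Waterman local alignment score named above is given below. The scoring values (match, mismatch and a linear gap penalty) are assumptions chosen for the example; they are not the default parameters of any of the listed packages, and the function name is hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and keeps
    only the previous row of the scoring matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the lower bound of 0
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


# Two hypothetical sequences sharing the segment ACGT
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

A higher score indicates a longer or better conserved shared segment; production tools additionally report the aligned positions and statistical significance.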

Such a recording medium storing the full genome 
sequence data is useful in preparing a polynucleotide array 
by which the expression amount of a gene encoded by the 
genome DNA of coryneform bacteria and the expression 
profile at the total gene level of the microorganism, 
namely, which genes among many genes encoded by the genome 
have been expressed and the expression ratio thereof, can 
be determined. 

The data recording device(s) provided by the 
present invention are, for example, memory device(s) for 
recording the data recorded in the recording medium of the 
present invention and target sequence or target structural 
motif data, or the like, and memory accessing device(s) 
for accessing the same. 

Namely, the system based on a computer according to 
the present invention comprises the following: 

(i) a user input device that inputs the information 
stored in the recording medium of the present invention, 
and target sequence or target structure motif information; 

(ii) a data storage device for at least temporarily 
storing the input information; 

(iii) a comparator that compares the information stored 
in the recording medium of the present invention with the 
target sequence or target structure motif information, 
recorded by the data storing device of (ii) for screening 
and analyzing nucleotide sequence information which is 
coincident with or analogous to the target sequence or 
target structure motif information; and 

(iv) an output device that shows a screening or 
analyzing result obtained by the comparator. 
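
The four components (i) to (iv) can be pictured with the following minimal sketch. It is an assumption-laden illustration, not the claimed system: the stored records are a hypothetical in-memory dictionary standing in for the recording medium, and the comparator uses a simple ungapped sliding-window identity rather than any of the homology search programs named above.

```python
def find_matches(records, target, min_identity=0.9):
    """Comparator: return (record_id, position, identity) for every window of
    a stored sequence whose ungapped identity to the target is at least
    min_identity."""
    target = target.upper()
    hits = []
    if not target:
        return hits
    for record_id, sequence in records.items():
        sequence = sequence.upper()
        for pos in range(len(sequence) - len(target) + 1):
            window = sequence[pos:pos + len(target)]
            same = sum(1 for x, y in zip(window, target) if x == y)
            identity = same / len(target)
            if identity >= min_identity:
                hits.append((record_id, pos, identity))
    return hits


def report(hits):
    """Output device: format the screening result as text lines."""
    return [f"{rid}\tposition {pos}\tidentity {ident:.2f}" for rid, pos, ident in hits]


# (i) input and (ii) temporary storage: hypothetical records and target motif
stored = {"seq1": "ATGGCTAGCTAA", "seq2": "ATGCCCGGGTAA"}
target_motif = "GCTAGC"

# (iii) comparison and (iv) output
for line in report(find_matches(stored, target_motif, min_identity=1.0)):
    print(line)
```

Lowering min_identity returns analogous as well as coincident sequences, which corresponds to the distinction drawn in item (iii).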

This system is usable in the methods in items 2 to 
5 as described above for searching and analyzing the ORF 
and EMF domains, target sequence, target structural motif, 
etc. of a coryneform bacterium, searching homologs, 
searching and analyzing isozymes, determining the 
biosynthesis pathway and the signal transmission pathway, 
and identifying spots which have been found in the proteome 
analysis. The term "homologs" as used herein includes both 
orthologs and paralogs. 

10. Production of polypeptide using ORF derived from 
coryneform bacteria 

The polypeptide of the present invention can be 
produced using a polynucleotide comprising the ORF obtained 
in the above item 2. Specifically, the polypeptide of the 
present invention can be produced by expressing the 
polynucleotide of the present invention or a fragment 
thereof in a host cell, using the method described in 
Molecular Cloning, 2nd ed., Current Protocols in Molecular 
Biology, and the like, for example, according to the 
following method. 

A DNA fragment having a suitable length containing 
a part encoding the polypeptide is prepared from the full 
length ORF sequence, if necessary. 

Also, if necessary, a DNA is prepared in which nucleotides 
in the nucleotide sequence of the part encoding the 
polypeptide of the present invention are replaced so as to 
give codons suitable for expression in the host cell. The 
DNA is useful for efficiently producing the polypeptide of 
the present invention. 
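
The codon replacement described above can be sketched as follows. This is an illustration under stated assumptions: the standard genetic code is used, and the table of preferred codons passed to the function is a hypothetical example rather than measured codon usage of any particular host.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def replace_codons(dna, preferred):
    """Replace each codon by the host-preferred synonymous codon, if one is given.

    preferred: dict mapping a one-letter amino acid to its preferred codon
    (hypothetical values in this sketch).
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    recoded = "".join(
        preferred.get(CODON_TABLE[dna[i:i + 3]], dna[i:i + 3])
        for i in range(0, len(dna), 3)
    )
    # The encoded amino acid sequence must be unchanged by the replacement
    if translate(recoded) != translate(dna):
        raise ValueError("preferred codon table changes the encoded polypeptide")
    return recoded


# Hypothetical preference table for leucine and arginine
print(replace_codons("ATGTTAAGATAA", {"L": "CTG", "R": "CGT"}))
```

Only the nucleotide sequence changes; the final check guards against a preference table that would alter the encoded polypeptide.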

A recombinant vector is prepared by inserting the 
DNA fragment downstream of a promoter in a 
suitable expression vector. 

The recombinant vector is introduced into a host cell 
suitable for the expression vector. 

Any of bacteria, yeasts, animal cells, insect cells, 
plant cells, and the like can be used as the host cell so 
long as it can express the gene of interest. 

Examples of the expression vector include those 
which can replicate autonomously in the above-described 
host cell or can be integrated into a chromosome and have a 
promoter at such a position that the DNA encoding the 
polypeptide of the present invention can be transcribed. 

When a prokaryotic cell, such as a bacterium or the 
like, is used as the host cell, it is preferred that the 
recombinant vector containing the DNA encoding the 
polypeptide of the present invention can replicate 
autonomously in the bacterium and is a recombinant vector 
constituted by at least a promoter, a ribosome binding 
sequence, the DNA of the present invention and a 
transcription termination sequence. A promoter-controlling 
gene can also be contained therewith in operable 
combination. 

Examples of the expression vectors include a vector 
plasmid which is replicable in Corynebacterium glutamicum, 
such as pCG1 (Japanese Published Unexamined Patent 
Application No. 134500/82), pCG2 (Japanese Published 
Unexamined Patent Application No. 35197/83), pCG4 (Japanese 
Published Unexamined Patent Application No. 183799/82), 
pCG11 (Japanese Published Unexamined Patent Application No. 
134500/82), pCG116, pCE54 and pCB101 (Japanese Published 
Unexamined Patent Application No. 105999/83), pCE51, pCE52 
and pCE53 (Mol. Gen. Genet., 196: 175-178 (1984)), and the 
like; a vector plasmid which is replicable in Escherichia 
coli, such as pET3 and pET11 (manufactured by Stratagene), 
pBAD, pThioHis and pTrcHis (manufactured by Invitrogen), 
pKK223-3 and pGEX2T (manufactured by Amersham Pharmacia 
Biotech), and the like; and pBTrp2, pBTac1 and pBTac2 
(manufactured by Boehringer Mannheim Co.), pSE280 
(manufactured by Invitrogen), pGEMEX-1 (manufactured by 
Promega), pQE-8 (manufactured by QIAGEN), pKYP10 (Japanese 
Published Unexamined Patent Application No. 110600/83), 
pKYP200 (Agric. Biol. Chem., 48: 669 (1984)), pLSA1 (Agric. 
Biol. Chem., 53: 277 (1989)), pGEL1 (Proc. Natl. Acad. Sci. 
USA, 82: 4306 (1985)), pBluescript II SK(-) (manufactured 
by Stratagene), pTrs30 (prepared from Escherichia coli 
JM109/pTrS30 (FERM BP-5407)), pTrs32 (prepared from 
Escherichia coli JM109/pTrS32 (FERM BP-5408)), pGHA2 
(prepared from Escherichia coli IGHA2 (FERM B-400), 
Japanese Published Unexamined Patent Application No. 
221091/85), pGKA2 (prepared from Escherichia coli IGKA2 
(FERM BP-6798), Japanese Published Unexamined Patent 
Application No. 221091/85), pTerm2 (U.S. Patents 4,686,191, 
4,939,094 and 5,160,735), pSupex, pUB110, pTP5, pC194 and 
pEG400 (J. Bacteriol., 172: 2392 (1990)), pGEX 
(manufactured by Pharmacia), pET system (manufactured by 
Novagen), and the like. 

Any promoter can be used so long as it can function 
in the host cell. Examples include promoters derived from 
Escherichia coli, phage and the like, such as trp promoter 
(Ptrp), lac promoter, PL promoter, PR promoter, T7 promoter 
and the like. Also, artificially designed and modified 
promoters, such as a promoter in which two Ptrp are linked 
in series (Ptrp x2), tac promoter, lacT7 promoter, letI 
promoter and the like, can be used. 

It is preferred to use a plasmid in which the space 
between Shine-Dalgarno sequence which is the ribosome 
binding sequence and the initiation codon is adjusted to an 
appropriate distance (for example, 6 to 18 nucleotides). 
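
A simple check of this spacing can be sketched as follows. The sketch assumes that the Shine-Dalgarno sequence is given as an exact motif (the value AGGAGG is a hypothetical example, not a sequence taken from the present specification) and defines the distance as the number of nucleotides between the end of the motif and the first nucleotide of the initiation codon.

```python
def sd_spacing(sequence, start_codon_index, sd_motif="AGGAGG"):
    """Return the number of nucleotides between the last Shine-Dalgarno motif
    upstream of the initiation codon and the codon itself, or None if the
    motif is not found upstream."""
    sequence = sequence.upper()
    # rfind restricted to the region upstream of the initiation codon
    pos = sequence.rfind(sd_motif, 0, start_codon_index)
    if pos == -1:
        return None
    return start_codon_index - (pos + len(sd_motif))


def spacing_is_appropriate(spacing, lower=6, upper=18):
    """Check the distance against the range given as an example in the text."""
    return spacing is not None and lower <= spacing <= upper


# Hypothetical construct: motif, 8-nucleotide spacer, then ATG at index 16
construct = "TTAGGAGGACAGCTATATGAAA"
spacing = sd_spacing(construct, construct.index("ATG"))
print(spacing, spacing_is_appropriate(spacing))
```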

The transcription termination sequence is not 
always necessary for the expression of the DNA of the 
present invention. However, it is preferred to arrange the 
transcription termination sequence immediately downstream of 
the structural gene. 

One of ordinary skill in the art will appreciate 
that the codons of the above-described elements may be 
optimized, in a known manner, depending on the host cells 
and environmental conditions utilized. 

Examples of the host cell include microorganisms 
belonging to the genus Escherichia, the genus Serratia, the 
genus Bacillus, the genus Brevibacterium, the genus 
Corynebacterium, the genus Microbacterium, the genus 
Pseudomonas, and the like. Specific examples include 
Escherichia coli XL1-Blue, Escherichia coli XL2-Blue, 
Escherichia coli DH1, Escherichia coli MC1000, Escherichia 
coli KY3276, Escherichia coli W1485, Escherichia coli JM109, 
Escherichia coli HB101, Escherichia coli No. 49, 
Escherichia coli W3110, Escherichia coli NY49, Escherichia 
coli GI698, Escherichia coli TB1, Serratia ficaria, 
Serratia fonticola, Serratia liquefaciens, Serratia 
marcescens, Bacillus subtilis, Bacillus amyloliquefaciens, 
Corynebacterium ammoniagenes, Brevibacterium immariophilum 
ATCC 14068, Brevibacterium saccharolyticum ATCC 14066, 
Corynebacterium glutamicum ATCC 13032, Corynebacterium 
glutamicum ATCC 13869, Corynebacterium glutamicum ATCC 
14067 (prior genus and species: Brevibacterium flavum), 
Corynebacterium glutamicum ATCC 13869 (prior genus and 
species: Brevibacterium lactofermentum, or Corynebacterium 
lactofermentum), Corynebacterium acetoacidophilum ATCC 
13870, Corynebacterium thermoaminogenes FERM 9244, 
Microbacterium ammoniaphilum ATCC 15354, Pseudomonas putida, 
Pseudomonas sp. D-0110, and the like. 

When Corynebacterium glutamicum or an analogous 
microorganism is used as a host, an EMF necessary for 
expressing the polypeptide is not always contained in the 
vector so long as the polynucleotide of the present 
invention contains an EMF. When the EMF is not contained 
in the polynucleotide, it is necessary to prepare the EMF 
separately and ligate it so as to be in operable 
combination. Also, when a higher expression amount or 
specific expression regulation is necessary, it is 
necessary to ligate the EMF corresponding thereto so as to 
put the EMF in operable combination with the polynucleotide. 
Examples of using an externally ligated EMF are disclosed 
in Microbiology, 142: 1297-1309 (1996). 

With regard to the method for the introduction of 
the recombinant vector, any method for introducing DNA into 
the above-described host cells, such as a method in which a 
calcium ion is used (Proc. Natl. Acad. Sci. USA, 69: 2110 
(1972)), a protoplast method (Japanese Published Unexamined 
Patent Application No. 2483942/88), the methods described 
in Gene, 27: 107 (1982) and Molecular & General Genetics, 
168: 111 (1979) and the like, can be used. 

When yeast is used as the host cell, examples of 
the expression vector include pYES2 (manufactured by 
Invitrogen), YEp13 (ATCC 37115), YEp24 (ATCC 37051), YCp50 
(ATCC 37419), pHS19, pHS15, and the like. 

Any promoter can be used so long as it can function 
in yeast. Examples include a promoter of a gene 
in the glycolytic pathway, such as hexose kinase and the 
like, PHO5 promoter, PGK promoter, GAP promoter, ADH 
promoter, gal 1 promoter, gal 10 promoter, a heat shock 
protein promoter, MFα1 promoter, CUP 1 promoter, and the 
like. 

Examples of the host cell include microorganisms 
belonging to the genus Saccharomyces, the genus 
Schizosaccharomyces, the genus Kluyveromyces, the genus 
Trichosporon, the genus Schwanniomyces, the genus Pichia, 
the genus Candida and the like. Specific examples include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, 
Kluyveromyces lactis, Trichosporon pullulans, 
Schwanniomyces alluvius, Candida utilis and the like. 

With regard to the method for the introduction of 
the recombinant vector, any method for introducing DNA into 
yeast, such as an electroporation method (Methods. Enzymol., 
194: 182 (1990)), a spheroplast method (Proc. Natl. Acad. 
Sci. USA, 75: 1929 (1978)), a lithium acetate method (J. 
Bacteriol., 153: 163 (1983)), a method described in Proc. 
Natl. Acad. Sci. USA, 75: 1929 (1978) and the like, can be 
used. 

When animal cells are used as the host cells, 
examples of the expression vector include pcDNA3.1, 
pSinRep5 and pCEP4 (manufactured by Invitrogen), pRev-Tre 
(manufactured by Clontech), pAxCAwt (manufactured by Takara 
Shuzo), pcDNAI and pcDM8 (manufactured by Funakoshi), 
pAGE107 (Japanese Published Unexamined Patent Application 
No. 22979/91; Cytotechnology, 3: 133 (1990)), pAS3-3 
(Japanese Published Unexamined Patent Application No. 
227075/90), pcDM8 (Nature, 329: 840 (1987)), pcDNAI/Amp 
(manufactured by Invitrogen), pREP4 (manufactured by 
Invitrogen), pAGE103 (J. Biochem., 101: 1307 (1987)), 
pAGE210, and the like. 

Any promoter can be used so long as it can function 
in animal cells. Examples include a promoter of IE 
(immediate early) gene of cytomegalovirus (CMV), an early 
promoter of SV40, a promoter of retrovirus, a 
metallothionein promoter, a heat shock promoter, SRα 
promoter, and the like. Also, the enhancer of the IE gene 
of human CMV can be used together with the promoter. 

Examples of the host cell include human Namalwa 
cell, monkey COS cell, Chinese hamster CHO cell, HST5637 
(Japanese Published Unexamined Patent Application No. 
299/88), and the like. 

The method for introduction of the recombinant 
vector into animal cells is not particularly limited, so 
long as it is the general method for introducing DNA into 
animal cells, such as an electroporation method 
(Cytotechnology, 3: 133 (1990)), a calcium phosphate method 
(Japanese Published Unexamined Patent Application No. 
227075/90), a lipofection method (Proc. Natl. Acad. Sci. 
USA, 84: 7413 (1987)), the method described in Virology, 
52: 456 (1973), and the like. 

When insect cells are used as the host cells, the 
polypeptide can be expressed, for example, by the method 
described in Baculovirus Expression Vectors, A Laboratory 
Manual, W.H. Freeman and Company, New York (1992), 
Bio/Technology, 6: 47 (1988), or the like. 

Specifically, a recombinant gene transfer vector 
and baculovirus are simultaneously introduced into insect 
cells to obtain a recombinant virus in an insect cell 
culture supernatant, and then the insect cells are infected 
with the resulting recombinant virus to express the 
polypeptide. 

Examples of the gene introducing vector used in the 
method include pBlueBac4.5, pVL1392, pVL1393 and 
pBlueBacIII (manufactured by Invitrogen), and the like. 

Examples of the baculovirus include Autographa 
californica nuclear polyhedrosis virus with which insects 
of the family Barathra are infected, and the like. 

Examples of the insect cells include Spodoptera 
frugiperda oocytes Sf9 and Sf21 (Baculovirus Expression 
Vectors, A Laboratory Manual, W.H. Freeman and Company, New 
York (1992)), Trichoplusia ni oocyte High 5 (manufactured 
by Invitrogen) and the like. 

The method for simultaneously incorporating the 
above-described recombinant gene transfer vector and the 
above-described baculovirus for the preparation of the 
recombinant virus includes a calcium phosphate method 
(Japanese Published Unexamined Patent Application No. 
227075/90), a lipofection method (Proc. Natl. Acad. Sci. USA, 
84: 7413 (1987)) and the like. 

When plant cells are used as the host cells, 
examples of the expression vector include a Ti plasmid, a 
tobacco mosaic virus vector, and the like. 

Any promoter can be used so long as it can function 
in plant cells. Examples include 35S promoter of 
cauliflower mosaic virus (CaMV), rice actin 1 promoter, and 
the like. 

Examples of the host cells include plant cells and 
the like, such as tobacco, potato, tomato, carrot, soybean, 
rape, alfalfa, rice, wheat, barley, and the like. 

The method for introducing the recombinant vector 
is not particularly limited, so long as it is the general 
method for introducing DNA into plant cells, such as the 
Agrobacterium method (Japanese Published Unexamined Patent 
Application No. 140885/84, Japanese Published Unexamined 
Patent Application No. 70080/85, WO 94/00977), the 
electroporation method (Japanese Published Unexamined 
Patent Application No. 251887/85), the particle gun method 
(Japanese Patents 2606856 and 2517813), and the like. 

The transformant of the present invention includes 
a transformant containing the polypeptide of the present 
invention per se rather than as a recombinant vector, that 
is, a transformant containing the polypeptide of the 
present invention which is integrated into a chromosome of 
the host, in addition to the transformant containing the 
above recombinant vector. 

When expressed in yeasts, animal cells, insect 
cells or plant cells, a glycopolypeptide or glycosylated 
polypeptide can be obtained. 

The polypeptide can be produced by culturing the 
thus obtained transformant of the present invention in a 
culture medium to produce and accumulate the polypeptide of 
the present invention or any polypeptide expressed under 
the control of an EMF of the present invention, and 
recovering the polypeptide from the culture. 

Culturing of the transformant of the present 
invention in a culture medium is carried out according to 
the conventional method as used in culturing of the host. 

When the transformant of the present invention is 
obtained using a prokaryote, such as Escherichia coli or 
the like, or a eukaryote, such as yeast or the like, as the 
host, the transformant is cultured as follows. 

Any of a natural medium and a synthetic medium can 
be used, so long as it contains a carbon source, a nitrogen 
source, an inorganic salt and the like which can be 
assimilated by the transformant and can perform culturing 
of the transformant efficiently. 

Examples of the carbon source include those which 
can be assimilated by the transformant, such as 
carbohydrates (for example, glucose, fructose, sucrose, 
molasses containing them, starch, starch hydrolysate, and 
the like), organic acids (for example, acetic acid, 
propionic acid, and the like), and alcohols (for example, 
ethanol, propanol, and the like). 

Examples of the nitrogen source include ammonia, 
various ammonium salts of inorganic acids or organic acids 
(for example, ammonium chloride, ammonium sulfate, ammonium 
acetate, ammonium phosphate, and the like), other 
nitrogen-containing compounds, peptone, meat extract, yeast 
extract, corn steep liquor, casein hydrolysate, soybean meal 
and soybean meal hydrolysate, various fermented cells and 
hydrolysates thereof, and the like. 

Examples of the inorganic salt include potassium 
dihydrogen phosphate, dipotassium hydrogen phosphate, 
magnesium phosphate, magnesium sulfate, sodium chloride, 
ferrous sulfate, manganese sulfate, copper sulfate, calcium 
carbonate, and the like. 

The culturing is carried out under aerobic 
conditions by shaking culture, submerged-aeration stirring 
culture or the like. The culturing temperature is 
preferably from 15 to 40°C, and the culturing time is 
generally from 16 hours to 7 days. The pH of the medium is 
preferably maintained at 3.0 to 9.0 during the culturing. 
The pH can be adjusted using an inorganic or organic acid, 
an alkali solution, urea, calcium carbonate, ammonia, or 
the like. 

Also, antibiotics, such as ampicillin, tetracycline, 
and the like, can be added to the medium during the 
culturing, if necessary. 

When a microorganism transformed with a recombinant 
vector containing an inducible promoter is cultured, an 
inducer can be added to the medium, if necessary. 

For example, isopropyl-β-D-thiogalactopyranoside 
(IPTG) or the like can be added to the medium when a 
microorganism transformed with a recombinant vector 
containing lac promoter is cultured, or indoleacrylic acid 
(IAA) or the like can be added thereto when a microorganism 
transformed with an expression vector containing trp 
promoter is cultured. 

Examples of the medium used in culturing a 
transformant obtained using animal cells as the host cells 
include RPMI 1640 medium (The Journal of the American 
Medical Association, 199: 519 (1967)), Eagle's MEM medium 
(Science, 122: 501 (1952)), Dulbecco's modified MEM medium 
(Virology, 8: 396 (1959)), 199 Medium (Proceedings of the 
Society for the Biological Medicine, 73: 1 (1950)), the 
above-described media to which fetal calf serum has been 
added, and the like. 

The culturing is carried out generally at a pH of 6 
to 8 and a temperature of 30 to 40°C in the presence of 5% 
CO2 for 1 to 7 days. 

Also, if necessary, antibiotics, such as kanamycin, 
penicillin, and the like, can be added to the medium during 
the culturing. 

Examples of the medium used in culturing a 
transformant obtained using insect cells as the host cells 
include TNM-FH medium (manufactured by Pharmingen), Sf-900 
II SFM (manufactured by Life Technologies), ExCell 400 and 
ExCell 405 (manufactured by JRH Biosciences), Grace's 
Insect Medium (Nature, 195: 788 (1962)), and the like. 

The culturing is carried out generally at a pH of 6 
to 7 and a temperature of 25 to 30°C for 1 to 5 days. 

Additionally, antibiotics, such as gentamicin and 
the like, can be added to the medium during the culturing, 
if necessary. 

A transformant obtained by using a plant cell as 
the host cell can be used as cells or after being 
differentiated into plant cells or organs. Examples of the 
medium used in the culturing of the transformant include 
Murashige and Skoog (MS) medium, White medium, media to 
which a plant hormone, such as auxin, cytokinin, or the 
like has been added, and the like. 

The culturing is carried out generally at a pH of 5 
to 9 and a temperature of 20 to 40°C for 3 to 60 days. 

Also, antibiotics, such as kanamycin, hygromycin 
and the like, can be added to the medium during the 
culturing, if necessary. 

As described above, the polypeptide can be produced 
by culturing a transformant derived from a microorganism, 
animal cell or plant cell containing a recombinant vector 
to which a DNA encoding the polypeptide of the present 
invention has been inserted according to the general 
culturing method to produce and accumulate the polypeptide, 
and recovering the polypeptide from the culture. 

The process of gene expression may include 
secretory production of the encoded protein, fusion 
protein expression and the like in accordance with the 
methods described in Molecular Cloning, 2nd ed., in 
addition to direct expression. 

The method for producing the polypeptide of the 
present invention includes a method of intracellular 
expression in a host cell, a method of extracellular 
secretion from a host cell, or a method of production on a 
host cell membrane outer envelope. The method can be 
selected by changing the host cell employed or the 
structure of the polypeptide produced. 

When the polypeptide of the present invention is 
produced in a host cell or on a host cell membrane outer 
envelope, the polypeptide can be positively secreted 
extracellularly according to, for example, the method of 
Paulson et al. (J. Biol. Chem., 264: 17619 (1989)), the 
method of Lowe et al. (Proc. Natl. Acad. Sci. USA, 86: 8227 
(1989); Genes Develop., 4: 1288 (1990)), and/or the methods 
described in Japanese Published Unexamined Patent 
Application No. 336963/93, WO 94/23021, and the like. 

Specifically, the polypeptide of the present 
invention can be positively secreted extracellularly by 
expressing it in a form in which a signal peptide has been 
added in front of a polypeptide containing an 
active site of the polypeptide of the present invention 
according to the recombinant DNA technique. 

Furthermore, the amount produced can be increased 
using a gene amplification system, such as by use of a 
dihydrofolate reductase gene or the like according to the 
method described in Japanese Published Unexamined Patent 
Application No. 227075/90. 

Moreover, the polypeptide of the present invention 
can be produced by a transgenic animal individual 
(transgenic nonhuman animal) or plant individual 
(transgenic plant). 

When the transformant is the animal individual or 
plant individual, the polypeptide of the present invention 
can be produced by breeding or cultivating it so as to 
produce and accumulate the polypeptide, and recovering the 
polypeptide from the animal individual or plant individual. 

Examples of the method for producing the 
polypeptide of the present invention using the animal 
individual include a method for producing the polypeptide 
of the present invention in an animal developed by 
inserting a gene according to methods known to those of 
ordinary skill in the art (American Journal of Clinical 
Nutrition, 63: 639S (1996), American Journal of Clinical 
Nutrition, 63: 627S (1996), Bio/Technology, 9: 830 (1991)). 

In the animal individual, the polypeptide can be 
produced by breeding a transgenic nonhuman animal into which 
the DNA encoding the polypeptide of the present invention 
has been inserted to produce and accumulate the polypeptide 
in the animal, and recovering the polypeptide from the 
animal. Examples of the production and accumulation place 
in the animal include milk (Japanese Published Unexamined 
Patent Application No. 309192/88), egg and the like of the 
animal. Any promoter can be used, so long as it can 
function in the animal. Suitable examples include an 
α-casein promoter, a β-casein promoter, a β-lactoglobulin 
promoter, a whey acidic protein promoter, and the like, 
which are specific for mammary glandular cells. 

Examples of the method for producing the 
polypeptide of the present invention using the plant 
individual include a method for producing the polypeptide 
of the present invention by cultivating a transgenic plant 
into which the DNA encoding the protein of the present 
invention has been introduced by a known method (Tissue 
Culture, 20 (1994), Tissue Culture, 21 (1994), Trends in 
Biotechnology, 15: 45 (1997)) to produce and accumulate the 
polypeptide in the plant, and recovering the polypeptide 
from the plant. 

The polypeptide according to the present invention 
can also be obtained by translation in vitro. 

The polypeptide of the present invention can be 
produced by a translation system in vitro. There are, for 
example, two in vitro translation methods which may be used, 
namely, a method using RNA as a template and another method 
using DNA as a template. The template RNA includes the 
whole RNA, mRNA, an in vitro transcription product, and the 
like. The template DNA includes a plasmid containing a 
transcriptional promoter and a target gene integrated 
therein downstream of the initiation site, a PCR/RT-PCR
product and the like. To select the most suitable system
for the in vitro translation, the origin of the gene
encoding the protein to be synthesized (prokaryotic
cell/eukaryotic cell), the type of the template (DNA/RNA),
the purpose of using the synthesized protein and the like
should be considered. In vitro translation kits having
various characteristics are commercially available from
many companies (Boehringer Mannheim, Promega, Stratagene,
or the like), and any of these kits can be used in producing
the polypeptide according to the present invention.

Transcription/translation of a DNA nucleotide
sequence cloned into a plasmid containing a T7 promoter can
be carried out using an in vitro transcription/translation
system, E. coli T7 S30 Extract System for Circular DNA
(manufactured by Promega, catalogue No. L1130). Also,
transcription/translation using, as a template, a linear
prokaryotic DNA of a supercoil non-sensitive promoter, such
as lacUV5, tac, λPL(con), λPL, or the like, can be carried
out using an in vitro transcription/translation system,
E. coli S30 Extract System for Linear Templates
(manufactured by Promega, catalogue No. L1030). Examples
of the linear prokaryotic DNA used as a template include a
DNA fragment, a PCR-amplified DNA product, a duplicated
oligonucleotide ligation, an in vitro transcriptional RNA,
a prokaryotic RNA, and the like.

In addition to the production of the polypeptide
according to the present invention, synthesis of a 
radioactive labeled protein, confirmation of the expression 
capability of a cloned gene, analysis of the function of 
transcriptional reaction or translation reaction, and the 
like can be carried out using this system. 

The polypeptide produced by the transformant of the 
present invention can be isolated and purified using the 
general method for isolating and purifying an enzyme. For 
example, when the polypeptide of the present invention is 
expressed as a soluble product in the host cells, the cells 
are collected by centrifugation after cultivation,
suspended in an aqueous buffer, and disrupted using an
ultrasonicator, a French press, a Manton Gaulin homogenizer,
a Dynomill, or the like to obtain a cell-free extract.
From the supernatant obtained by centrifuging the cell-free
extract, a purified product can be obtained by the general
method used for isolating and purifying an enzyme, for
example, solvent extraction, salting out using ammonium
sulfate or the like, desalting, precipitation using an
organic solvent, anion exchange chromatography using a
resin, such as diethylaminoethyl (DEAE)-Sepharose, DIAION
HPA-75 (manufactured by Mitsubishi Chemical) or the like,
cation exchange chromatography using a resin, such as
S-Sepharose FF (manufactured by Pharmacia) or the like,
hydrophobic chromatography using a resin, such as butyl
sepharose, phenyl sepharose or the like, gel filtration
using a molecular sieve, affinity chromatography,
chromatofocusing, or electrophoresis, such as isoelectric
focusing or the like, alone or in combination thereof.

When the polypeptide is expressed as an insoluble 
product in the host cells, the cells are collected in the 
same manner, disrupted and centrifuged to recover the 
insoluble product of the polypeptide as the precipitate 
fraction. Next, the insoluble product of the polypeptide 
is solubilized with a protein denaturing agent. The 
solubilized solution is diluted or dialyzed to lower the 
concentration of the protein denaturing agent in the 
solution. Thus, the normal conformation of the
polypeptide is reconstituted. After the procedure, a
purified product of the polypeptide can be obtained by a 
purification/isolation method similar to the above. 

When the polypeptide of the present invention or 
its derivative (for example, a polypeptide formed by adding 
a sugar chain thereto) is secreted out of cells, the 
polypeptide or its derivative can be collected in the 
culture supernatant. Namely, the culture supernatant is 
obtained by treating the culture medium in a treatment 
similar to the above (for example, centrifugation). Then,
a purified product can be obtained from the culture supernatant
using a purification/isolation method similar to the above.

The polypeptide obtained by the above method is
within the scope of the polypeptide of the present 
invention, and examples include a polypeptide encoded by a 
polynucleotide comprising the nucleotide sequence selected 
from SEQ ID NOS:2 to 3431, and a polypeptide comprising an 
amino acid sequence represented by any one of SEQ ID 
NOS:3502 to 6931. 

Furthermore, a polypeptide comprising an amino acid 
sequence in which at least one amino acid is deleted,
replaced, inserted or added in the amino acid sequence of
the polypeptide and having substantially the same activity
as that of the polypeptide is included in the scope of the
present invention. The term "substantially the same
activity as that of the polypeptide" means the same
activity represented by the inherent function, enzyme
activity or the like possessed by the polypeptide in which
no amino acid has been deleted, replaced, inserted or added.
The polypeptide can be obtained using a method for introducing
site-specific mutation(s) described in, for example,
Molecular Cloning, 2nd ed., Current Protocols in Molecular
Biology, Nuc. Acids Res., 10: 6487 (1982), Proc. Natl.
Acad. Sci. USA, 79: 6409 (1982), Gene, 34: 315 (1985), Nuc.
Acids Res., 13: 4431 (1985), Proc. Natl. Acad. Sci. USA,
82: 488 (1985) and the like. For example, the polypeptide
can be obtained by introducing mutation(s) into DNA encoding
a polypeptide having the amino acid sequence represented by
any one of SEQ ID NOS:3502 to 6931. The number of the
amino acids which are deleted, replaced, inserted or added
is not particularly limited; however, it is usually 1 to
the order of tens, preferably 1 to 20, more preferably 1 to
10, and most preferably 1 to 5, amino acids.

The at least one amino acid deletion, replacement,
insertion or addition in the amino acid sequence of the
polypeptide of the present invention is used herein to
mean that at least one amino acid is deleted, replaced,
inserted or added at one or more positions in the
amino acid sequence. The deletion, replacement, insertion
or addition may occur in the same amino acid sequence
simultaneously. Also, the amino acid residue replaced,
inserted or added can be natural or non-natural. Examples
of the natural amino acid residue include L-alanine,
L-asparagine, L-aspartic acid, L-glutamine, L-glutamic
acid, glycine, L-histidine, L-isoleucine, L-leucine,
L-lysine, L-methionine, L-phenylalanine, L-proline,
L-serine, L-threonine, L-tryptophan, L-tyrosine, L-valine,
L-cysteine, and the like.

Herein, examples of amino acid residues which are 
replaced with each other are shown below. The amino acid 
residues in the same group can be replaced with each other. 
Group A:
leucine, isoleucine, norleucine, valine, norvaline,
alanine, 2-aminobutanoic acid, methionine, O-methylserine,
t-butylglycine, t-butylalanine, cyclohexylalanine;

Group B:
aspartic acid, glutamic acid, isoaspartic acid,
isoglutamic acid, 2-aminoadipic acid, 2-aminosuberic acid;

Group C:
asparagine, glutamine;

Group D:
lysine, arginine, ornithine, 2,4-diaminobutanoic
acid, 2,3-diaminopropionic acid;

Group E:
proline, 3-hydroxyproline, 4-hydroxyproline;

Group F:
serine, threonine, homoserine;

Group G:
phenylalanine, tyrosine.
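
The grouping above can be read as a lookup table. The following Python sketch is an illustration only; the names SUBSTITUTION_GROUPS, group_of and same_group are hypothetical and do not appear in the specification.

```python
# Substitution groups A to G as listed above (illustrative data structure).
SUBSTITUTION_GROUPS = {
    "A": {"leucine", "isoleucine", "norleucine", "valine", "norvaline",
          "alanine", "2-aminobutanoic acid", "methionine", "O-methylserine",
          "t-butylglycine", "t-butylalanine", "cyclohexylalanine"},
    "B": {"aspartic acid", "glutamic acid", "isoaspartic acid",
          "isoglutamic acid", "2-aminoadipic acid", "2-aminosuberic acid"},
    "C": {"asparagine", "glutamine"},
    "D": {"lysine", "arginine", "ornithine", "2,4-diaminobutanoic acid",
          "2,3-diaminopropionic acid"},
    "E": {"proline", "3-hydroxyproline", "4-hydroxyproline"},
    "F": {"serine", "threonine", "homoserine"},
    "G": {"phenylalanine", "tyrosine"},
}


def group_of(residue):
    """Return the letter of the group listing the residue, or None."""
    for letter, members in SUBSTITUTION_GROUPS.items():
        if residue in members:
            return letter
    return None


def same_group(residue_a, residue_b):
    """True when both residues are listed in the same group."""
    letter = group_of(residue_a)
    return letter is not None and letter == group_of(residue_b)
```

For instance, same_group("leucine", "valine") is true because both residues are in Group A, whereas a residue that is not listed in any group, such as glycine, is never reported as replaceable by this sketch.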

Also, in order that the resulting mutant 
polypeptide has substantially the same activity as that of 
the polypeptide which has not been mutated, it is preferred 
that the mutant polypeptide has a homology of 60% or more, 
preferably 80% or more, and particularly preferably 95% or 
more, with the polypeptide which has not been mutated, when 
calculated, for example, using default (initial setting) 
parameters by a homology searching software, such as BLAST, 
FASTA, or the like. 
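
As a rough illustration of the thresholds above, the following Python sketch computes a percent identity from two sequences that have already been aligned. It is a simplification under stated assumptions: BLAST and FASTA compute their own alignments and scores, and the function names and tier labels used here are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding identical residues.

    Both inputs must be pre-aligned strings of equal length that use
    '-' for gaps. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def homology_tier(identity):
    """Map a percent value onto the 60% / 80% / 95% levels named above."""
    if identity >= 95.0:
        return "particularly preferred"
    if identity >= 80.0:
        return "preferred"
    if identity >= 60.0:
        return "acceptable"
    return "below threshold"
```

With this sketch, an alignment with one gap in five columns, such as "MK-AY" against "MKTAY", gives 80 percent and falls in the "preferred" tier.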



Also, the polypeptide of the present invention can
be produced by a chemical synthesis method, such as the Fmoc
(fluorenylmethyloxycarbonyl) method, the tBoc
(t-butyloxycarbonyl) method, or the like. It can also be
synthesized using a peptide synthesizer manufactured by
Advanced ChemTech, Perkin-Elmer, Pharmacia, Protein
Technology Instrument, Synthecell-Vega, PerSeptive,
Shimadzu Corporation, or the like.

The transformant of the present invention can be 
used for purposes other than the production of the
polypeptide of the present invention. 

Specifically, at least one component selected from 
an amino acid, a nucleic acid, a vitamin, a saccharide, an 
organic acid, and analogues thereof can be produced by 
culturing the transformant containing the polynucleotide or 
recombinant vector of the present invention in a medium to 
produce and accumulate at least one component selected from 
amino acids, nucleic acids, vitamins, saccharides, organic 
acids, and analogues thereof, and recovering the same from 
the medium. 

The biosynthesis pathways, decomposition pathways 
and regulatory mechanisms of physiologically active 
substances such as amino acids, nucleic acids, vitamins, 
saccharides, organic acids and analogues thereof differ 
from organism to organism. The productivity of such a 
physiologically active substance can be improved using 
these differences, specifically by introducing a
heterologous gene relating to the biosynthesis thereof.
For example, the content of lysine, which is one of the
essential amino acids, in a plant seed was improved by
introducing a synthase gene derived from a bacterium (WO
93/19190). Also, arginine is excessively produced in a
culture by introducing an arginine synthase gene derived
from Escherichia coli (Japanese Examined Patent Publication
23750/93).

To produce such a physiologically active substance, 
the transformant according to the present invention can be 
cultured by the same method as employed in culturing the 
transformant for producing the polypeptide of the present 
invention as described above. Also, the physiologically
active substance can be recovered from the culture medium
by a combination of, for example, the ion exchange resin
method, the precipitation method and other known methods.

Examples of methods for introducing a polynucleotide into a
host which are known to one of ordinary skill
in the art include electroporation, calcium transfection,
the protoplast method, the method using a phage, and the
like, when the host is a bacterium; and microinjection,
calcium phosphate transfection, the positively charged
lipid-mediated method and the method using a virus, and the
like, when the host is a eukaryote (Molecular Cloning, 2nd
ed.; Spector et al., Cells: A Laboratory Manual, Cold Spring
Harbor Laboratory Press, 1998). Examples of the host
include prokaryotes, lower eukaryotes (for example, yeasts),
higher eukaryotes (for example, mammals), and cells
isolated therefrom. A recombinant polynucleotide fragment
present in the host cells can
be integrated into the chromosome of the host.
Alternatively, it can be integrated into a factor (for
example, a plasmid) having an independent replication unit
outside the chromosome. These transformants are usable in
producing the polypeptides of the present invention encoded
by the ORF of the genome of Corynebacterium glutamicum, the
polynucleotides of the present invention and fragments
thereof. Alternatively, they can be used in producing
arbitrary polypeptides under the regulation by an EMF of
the present invention.

11. Preparation of antibody recognizing the polypeptide of 
the present invention 

An antibody which recognizes the polypeptide of the 
present invention, such as a polyclonal antibody, a 
monoclonal antibody, or the like, can be produced using, as 
an antigen, a purified product of the polypeptide of the 
present invention or a partial fragment polypeptide of the 
polypeptide or a peptide having a partial amino acid 
sequence of the polypeptide of the present invention.

(1) Production of polyclonal antibody 

A polyclonal antibody can be produced using, as an
antigen, a purified product of the polypeptide of the 
present invention, a partial fragment polypeptide of the 
polypeptide, or a peptide having a partial amino acid 
sequence of the polypeptide of the present invention, and 
immunizing an animal with the same. 

Examples of the animal to be immunized include 
rabbits, goats, rats, mice, hamsters, chickens and the like. 

A dosage of the antigen is preferably 50 to 100 µg
per animal.

When the peptide is used as the antigen, it is
preferably a peptide covalently bonded to a carrier protein,
such as keyhole limpet haemocyanin, bovine thyroglobulin,
or the like. The peptide used as the antigen can be
synthesized by a peptide synthesizer.

The administration of the antigen is, for example, 
carried out 3 to 10 times at the intervals of 1 or 2 weeks 
after the first administration. On the 3rd to 7th day 
after each administration, a blood sample is collected from 
the venous plexus of the eyeground, and it is confirmed 
that the serum reacts with the antigen by the enzyme 
immunoassay (Enzyme-linked Immunosorbent Assay (ELISA),
Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold
Spring Harbor Laboratory (1988)) or the like.

Serum is obtained from the immunized non-human 
mammal with a sufficient antibody titer against the antigen 
used for the immunization, and the serum is isolated and
purified to obtain a polyclonal antibody. 

Examples of the method for the isolation and
purification include centrifugation, salting out by 40-50%
saturated ammonium sulfate, caprylic acid precipitation
(Antibodies, A Laboratory Manual, Cold Spring Harbor
Laboratory (1988)), or chromatography using a
DEAE-Sepharose column, an anion exchange column, a protein A- or
G-column, a gel filtration column, and the like, alone or
in combination thereof, by methods known to those of
ordinary skill in the art.

(2) Production of monoclonal antibody 

(a) Preparation of antibody-producing cell 

A rat having a serum showing a sufficient antibody
titer against a partial fragment polypeptide of the
polypeptide of the present invention used for immunization
is used as a supply source of an antibody-producing cell.

On the 3rd to 7th day after the final administration of the
antigen substance to the rat showing the antibody titer,
the spleen is excised.

The spleen is cut to pieces in MEM medium
(manufactured by Nissui Pharmaceutical), loosened using a
pair of forceps, followed by centrifugation at 1,200 rpm
for 5 minutes, and the resulting supernatant is discarded.

The spleen cells in the precipitated fraction are treated
with a Tris-ammonium chloride buffer (pH 7.65) for 1 to 2
minutes to eliminate erythrocytes and washed three times
with MEM medium, and the resulting spleen cells are used as
antibody-producing cells.

(b) Preparation of myeloma cells 

As myeloma cells, an established cell line obtained
from mouse or rat is used. Examples of useful cell lines
include those derived from a mouse, such as P3-X63Ag8-U1
(hereinafter referred to as "P3-U1") (Curr. Topics in
Microbiol. Immunol., 81: 1 (1978); Europ. J. Immunol.,
6: 511 (1976)); SP2/0-Ag14 (SP-2) (Nature, 276: 269
(1978)); P3-X63-Ag8653 (653) (J. Immunol., 123: 1548
(1979)); P3-X63-Ag8 (X63) cell line (Nature, 256: 495
(1975)), and the like, which are 8-azaguanine-resistant
mouse (BALB/c) myeloma cell lines. These cell lines are
subcultured in 8-azaguanine medium (a medium obtained by
adding 1.5 mmol/l glutamine, 5×10⁻⁵ mol/l 2-mercaptoethanol,
10 µg/ml gentamicin and 10% fetal calf serum (FCS)
(manufactured by CSL) to RPMI-1640 medium (hereinafter
referred to as the "normal medium"), to which 8-azaguanine
is further added at 15 µg/ml) and cultured in
the normal medium 3 or 4 days before cell fusion, and 2×10⁷
or more of the cells are used for the fusion.

(c) Production of hybridoma

The antibody-producing cells obtained in (a) and 
the myeloma cells obtained in (b) are washed with MEM 
medium or PBS (disodium hydrogen phosphate: 1.83 g, sodium
dihydrogen phosphate: 0.21 g, sodium chloride: 7.65 g,
distilled water: 1 liter, pH: 7.2) and mixed to give a
ratio of antibody-producing cells : myeloma cells = 5:1
to 10:1, followed by centrifugation at 1,200 rpm for 5
minutes, and the supernatant is discarded. 

The cells in the resulting precipitated fraction
are thoroughly loosened, 0.2 to 1 ml of a mixed solution
of 2 g of polyethylene glycol-1000 (PEG-1000), 2 ml of MEM
medium and 0.7 ml of dimethylsulfoxide (DMSO) per 10⁸
antibody-producing cells is added to the cells under
stirring at 37°C, and then 1 to 2 ml of MEM medium is
further added thereto several times at 1 to 2 minute
intervals.
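
The quantities used in the fusion step scale with the number of antibody-producing cells. The following Python sketch works through that arithmetic using the 5:1 to 10:1 cell ratio and the 0.2 to 1 ml of mixed solution per 10⁸ cells given above; the function names are hypothetical and the sketch is not a validated protocol.

```python
def myeloma_cells_needed(antibody_producing_cells, ratio):
    """Myeloma cells required for a given mixing ratio.

    ratio is the number of antibody-producing cells per myeloma
    cell; the text gives a range of 5 to 10.
    """
    if not 5 <= ratio <= 10:
        raise ValueError("ratio outside the 5:1 to 10:1 range")
    return antibody_producing_cells / ratio


def peg_solution_volume_ml(antibody_producing_cells):
    """(minimum, maximum) ml of the PEG-1000/MEM/DMSO mixture.

    The text gives 0.2 to 1 ml per 1e8 antibody-producing cells.
    """
    units = antibody_producing_cells / 1e8
    return (0.2 * units, 1.0 * units)
```

For 10⁸ antibody-producing cells mixed at 5:1, this gives 2×10⁷ myeloma cells, which matches the minimum number of myeloma cells stated in (b).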

After the addition, MEM medium is added to give a 
total amount of 50 ml. The resulting prepared solution is 
centrifuged at 900 rpm for 5 minutes, and then the
supernatant is discarded. The cells in the resulting
precipitated fraction are gently loosened and then gently
suspended in 100 ml of HAT medium (the normal medium to
which 10⁻⁴ mol/l hypoxanthine, 1.5×10⁻⁵ mol/l thymidine and
4×10⁻⁷ mol/l aminopterin have been added) by repeated
drawing up into and discharging from a measuring pipette.

The suspension is poured into a 96 well culture
plate at 100 µl/well and cultured at 37°C for 7 to 14 days
in a 5% CO₂ incubator.
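
The molar concentrations given for HAT medium (10⁻⁴ mol/l hypoxanthine, 1.5×10⁻⁵ mol/l thymidine and 4×10⁻⁷ mol/l aminopterin) can be converted to masses for a chosen volume. The Python sketch below shows the arithmetic; the molar masses are assumed values for the free compounds and are not taken from the specification, and the names are hypothetical.

```python
# Molar masses in g/mol (assumed values, not from the specification).
MOLAR_MASS = {
    "hypoxanthine": 136.11,
    "thymidine": 242.23,
    "aminopterin": 440.41,
}

# Final concentrations in mol/l as stated for HAT medium.
HAT_CONCENTRATION = {
    "hypoxanthine": 1e-4,
    "thymidine": 1.5e-5,
    "aminopterin": 4e-7,
}


def hat_supplement_mg(volume_ml):
    """Milligrams of each supplement needed for volume_ml of medium."""
    litres = volume_ml / 1000.0
    return {
        name: HAT_CONCENTRATION[name] * litres * MOLAR_MASS[name] * 1000.0
        for name in HAT_CONCENTRATION
    }
```

For the 100 ml of HAT medium used above, the sketch gives roughly 1.36 mg of hypoxanthine, 0.36 mg of thymidine and 0.018 mg of aminopterin under the assumed molar masses.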

After culturing, a part of the culture supernatant 
is recovered, and a hybridoma which specifically reacts 
with a partial fragment polypeptide of the polypeptide of 
the present invention is selected according to the enzyme 
immunoassay described in Antibodies, A Laboratory Manual,
Cold Spring Harbor Laboratory, Chapter 14 (1988) and the
like. 

A specific example of the enzyme immunoassay is 
described below. 

The partial fragment polypeptide of the polypeptide
of the present invention used as the antigen in the
immunization is coated on a suitable plate, is allowed to
react with a hybridoma culture supernatant or a purified
antibody obtained in (d) described below as a first
antibody, and is further allowed to react with an anti-rat
or anti-mouse immunoglobulin antibody labeled with an
enzyme, a chemiluminescent substance, a radioactive
substance or the like as a second antibody, followed by a
reaction suitable for the labeled substance. A hybridoma which
specifically reacts with the polypeptide of the present
invention is selected as a hybridoma capable of producing a
monoclonal antibody of the present invention.

Cloning of the hybridoma is repeated twice by
limiting dilution analysis (HT medium (a medium in which
aminopterin has been removed from HAT medium) is used
first, and the normal medium is used second), and a
hybridoma which is stable and shows a sufficient
antibody titer is selected as a hybridoma capable of
producing a monoclonal antibody of the present invention.

(d) Preparation of monoclonal antibody 

The monoclonal antibody-producing hybridoma cells 
obtained in (c) are injected intraperitoneally into 8- to
10-week-old mice or nude mice treated with pristane
(intraperitoneal administration of 0.5 ml of
2,6,10,14-tetramethylpentadecane (pristane), followed by 2
weeks of feeding) at 5×10⁶ to 20×10⁶ cells/animal. The
hybridoma causes ascites tumor in 10 to 21 days. 

The ascitic fluid is collected from the mice or 
nude mice, and centrifuged at 3,000 rpm for 5 minutes to
remove solid contents.

A monoclonal antibody can be purified and isolated
from the resulting supernatant by a method
similar to that used for the polyclonal antibody.

The subclass of the antibody can be determined 
using a mouse monoclonal antibody typing kit or a rat 
monoclonal antibody typing kit. The polypeptide amount can 
be determined by the Lowry method or by calculation based
on the absorbance at 280 nm. 
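
A calculation from the absorbance at 280 nm typically follows the Beer-Lambert relation, c = A / (ε × l). The Python sketch below illustrates it; the default extinction coefficient of 1.4 ml mg⁻¹ cm⁻¹ is an assumed, commonly used approximation for immunoglobulin G and is not given in the specification, and the function name is hypothetical.

```python
def concentration_mg_per_ml(a280, extinction_coefficient=1.4,
                            path_length_cm=1.0, dilution_factor=1.0):
    """Estimate protein concentration from absorbance at 280 nm.

    extinction_coefficient is in ml mg^-1 cm^-1 (assumed default for
    IgG); dilution_factor scales the result back to the undiluted
    sample.
    """
    if extinction_coefficient <= 0 or path_length_cm <= 0:
        raise ValueError("coefficient and path length must be positive")
    return a280 * dilution_factor / (extinction_coefficient * path_length_cm)
```

Under these assumptions, an absorbance of 0.7 measured in a 1 cm cuvette corresponds to about 0.5 mg/ml of antibody.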

The antibody obtained in the above is within the 
scope of the antibody of the present invention. 

The antibody can be used for general assays
using an antibody, such as a radioactive material labeled
immunoassay (RIA), competitive binding assay, an
immunohistochemical staining method (ABC method, CSA
method, etc.), immunoprecipitation, Western blotting, ELISA
assay, and the like (An Introduction to Radioimmunoassay
and Related Techniques, Elsevier Science (1986); Techniques
in Immunocytochemistry, Academic Press, Vol. 1 (1982),
Vol. 2 (1983) & Vol. 3 (1985); Practice and Theory of
Enzyme Immunoassays, Elsevier Science (1985); Enzyme-linked
Immunosorbent Assay (ELISA), Igaku Shoin (1976);
Antibodies - A Laboratory Manual, Cold Spring Harbor
Laboratory (1988); Monoclonal Antibody Experiment Manual,
Kodansha Scientific (1987); Second Series Biochemical
Experiment Course, Vol. 5, Immunobiochemistry Research
Method, Tokyo Kagaku Dojin (1986)).

The antibody of the present invention can be used 
as it is or after being labeled with a label. 

Examples of the label include a radioisotope, an
affinity label (e.g., biotin, avidin, or the like), an
enzyme label (e.g., horseradish peroxidase, alkaline
phosphatase, or the like), a fluorescence label (e.g., FITC,
rhodamine, or the like), a label using a rhodamine atom (J.
Histochem. Cytochem., 18: 315 (1970); Meth. Enzym., 62: 308
(1979); Immunol., 109: 129 (1972); J. Immunol. Meth.,
13: 215 (1979)), and the like.

Expression of the polypeptide of the present
invention, fluctuation of the expression, the presence or 
absence of structural change of the polypeptide, and the 
presence or absence in an organism other than coryneform 
bacteria of a polypeptide corresponding to the polypeptide 
can be analyzed using the antibody or the labeled antibody 
by the above assay, or a polypeptide array or proteome 
analysis described below. 

Furthermore, the polypeptide recognized by the 
antibody can be purified by immunoaffinity chromatography
using the antibody of the present invention. 

12. Production and use of polypeptide array
(1) Production of polypeptide array 

A polypeptide array can be produced using the 
polypeptide of the present invention obtained in the above 
item 10 or the antibody of the present invention obtained 
in the above item 11. 

The polypeptide array of the present invention 
includes protein chips, and comprises a solid support and 
the polypeptide or antibody of the present invention 
adhered to the surface of the solid support. 



Examples of the solid support include plastic such 
as polycarbonate or the like; an acrylic resin, such as 
polyacrylamide or the like; complex carbohydrates, such as 
agarose, sepharose, or the like; silica; a silica-based 
material, carbon, a metal, inorganic glass, latex beads, 
and the like. 

The polypeptides or antibodies according to the 
present invention can be adhered to the surface of the 
solid support according to the method described in 
Biotechniques, 27: 1258-61 (1999); Molecular Medicine Today,
5: 326-7 (1999); Handbook of Experimental Immunology, 4th
edition, Blackwell Scientific Publications, Chapter 10
(1986); Meth. Enzym., 34 (1974); Advances in Experimental
Medicine and Biology, 42 (1974); U.S. Patent 4,681,870; U.S.
Patent 4,282,287; U.S. Patent 4,762,881, or the like.

The analysis described herein can be efficiently 
performed by adhering the polypeptide or antibody of the 
present invention to the solid support at a high density, 
though a high fixation density is not always necessary. 

(2) Use of polypeptide array 

A polypeptide or a compound capable of binding to 
and interacting with the polypeptides of the present 
invention adhered to the array can be identified using the 
polypeptide array to which the polypeptides of the present
invention have been adhered as described in the
above (1).

Specifically, a polypeptide or a compound capable 
of binding to and interacting with the polypeptides of the 
present invention can be identified by subjecting the 
polypeptides of the present invention to the following 
steps (i) to (iv):

(i) preparing a polypeptide array having the
polypeptide of the present invention adhered thereto by the
method of the above (1);

(ii) incubating the polypeptide immobilized on the
polypeptide array together with at least one of a second
polypeptide or compound;

(iii) detecting any complex formed between the at least
one of a second polypeptide or compound and the polypeptide
immobilized on the array using, for example, a label bound
to the at least one of a second polypeptide or compound, or
a secondary label which specifically binds to the complex
or to a component of the complex after unbound material has
been removed; and

(iv) analyzing the detection data. 
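
Step (iv) is not specified further in the text. One simple possibility, shown here only as a hedged Python sketch, is to call a spot a binder when its signal exceeds the mean of blank spots by a chosen number of standard deviations. The function name, the threshold rule and the spot identifiers are all assumptions made for illustration.

```python
from statistics import mean, stdev


def call_binders(spot_signals, blank_signals, n_sd=3.0):
    """Return sorted ids of spots whose signal exceeds the blank level.

    spot_signals maps a spot id to its measured signal; blank_signals
    is a list of signals from spots carrying no polypeptide. The
    cutoff is mean(blank) + n_sd * stdev(blank) (assumed rule).
    """
    threshold = mean(blank_signals) + n_sd * stdev(blank_signals)
    return sorted(spot for spot, signal in spot_signals.items()
                  if signal > threshold)
```

In practice the cutoff, the normalization and the handling of replicate spots would depend on the label and the detection instrument used in step (iii).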

Specific examples of the polypeptide array to which
the polypeptide of the present invention has been adhered
include a polypeptide array containing a solid support to
which at least one of the following is adhered: a polypeptide
containing an amino acid sequence selected from SEQ ID
NOS:3502 to 7001, a polypeptide containing an amino acid
sequence in which at least one amino acid is deleted,
replaced, inserted or added in the amino acid sequence of
the polypeptide and having substantially the same activity
as that of the polypeptide, a polypeptide containing an
amino acid sequence having a homology of 60% or more with
the amino acid sequence of the polypeptide and having
substantially the same activity as that of the polypeptide,
a partial fragment polypeptide, and a peptide comprising an
amino acid sequence of a part of a polypeptide.

The amount of production of a polypeptide derived 
from coryneform bacteria can be analyzed using a 
polypeptide array to which the antibody of the present 
invention has been adhered in the above (1).

Specifically, the expression amount of a gene
derived from a mutant of coryneform bacteria can be
analyzed by subjecting the gene to the following steps (i)
to (iv):

(i) preparing a polypeptide array by the method of the
above (1);

(ii) incubating the polypeptide array (the first
antibody) together with a polypeptide derived from a mutant
of coryneform bacteria;

(iii) detecting the polypeptide bound to the polypeptide
immobilized on the array using a labeled second antibody of
the present invention; and

(iv) analyzing the detection data.

Specific examples of the polypeptide array to which
the antibody of the present invention is adhered include a
polypeptide array comprising a solid support to which is
adhered at least one antibody which recognizes a polypeptide
comprising an amino acid sequence selected from SEQ ID
NOS:3502 to 7001, a polypeptide comprising an amino acid
sequence in which at least one amino acid is deleted,
replaced, inserted or added in the amino acid sequence of
the polypeptide and having substantially the same activity
as that of the polypeptide, a polypeptide comprising an
amino acid sequence having a homology of 60% or more with
the amino acid sequence of the polypeptide and having
substantially the same activity as that of the polypeptide,
a partial fragment polypeptide, or a peptide comprising an
amino acid sequence of a part of a polypeptide.

A fluctuation in an expression amount of a specific 
polypeptide can be monitored using a polypeptide obtained 
in the time course of culture as the polypeptide derived 
from coryneform bacteria. The culturing conditions can be 
optimized by analyzing the fluctuation. 
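
Monitoring a fluctuation over the time course of culture can be reduced to comparing each time point with the first one. The Python sketch below illustrates this; the data layout (a list of hour and signal pairs) and the function names are assumptions made for illustration.

```python
def fold_changes(time_course):
    """Signal at each time point relative to the first time point.

    time_course is a list of (hours, signal) pairs sorted by time.
    """
    baseline = time_course[0][1]
    if baseline <= 0:
        raise ValueError("baseline signal must be positive")
    return [(hours, signal / baseline) for hours, signal in time_course]


def peak_time(time_course):
    """Culture time (hours) at which the signal is highest."""
    return max(time_course, key=lambda point: point[1])[0]
```

Comparing such curves across culturing conditions is one way the fluctuation could inform the optimization mentioned above.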

When a polypeptide derived from a mutant of 
coryneform bacteria is used, a mutated polypeptide can be 
detected.

13. Identification of useful mutation in mutant by proteome
analysis

Usually, the proteome analysis is used herein to refer to a
method wherein a polypeptide is separated by
two-dimensional electrophoresis and the separated polypeptide
is digested with an enzyme, followed by identification of
the polypeptide using a mass spectrometer (MS) and
searching a database.

The two-dimensional electrophoresis means an
electrophoretic method which is performed by combining two
electrophoretic procedures having different principles.
For example, polypeptides are separated depending on
molecular weight in the primary electrophoresis. Next, the
gel is rotated by 90° or 180° and the secondary
electrophoresis is carried out depending on isoelectric
point. Thus, various separation patterns can be achieved
(JIS K 3600 2474).

In searching the data base, the amino acid sequence 
information of the polypeptides of the present invention 
and the recording medium of the present invention provided
in the above items 2 and 8 can be used.

The proteome analysis of a coryneform bacterium and
its mutant makes it possible to identify a polypeptide
showing a fluctuation therebetween.

The proteome analysis of a wild type strain of 
coryneform bacteria and a production strain showing an
improved productivity of a target product makes it possible
to efficiently identify a mutated protein which is useful
in breeding for improving the productivity of a target
product or a protein whose expression amount
fluctuates.

Specifically, a wild type strain of coryneform 
bacteria and a lysine-producing strain thereof are each 
subjected to the proteome analysis. Then, a spot increased 
in the lysine-producing strain, compared with the wild type 
strain, is found and a database is searched so that a
polypeptide showing an increase in yield in accordance with
an increase in the lysine productivity can be identified.
For example, as a result of the proteome analysis on a wild
type strain and a lysine-producing strain, the productivity
of the catalase having the amino acid sequence represented
by SEQ ID NO: 3785 is increased in the lysine-producing
mutant.
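
The comparison of spots between a wild type strain and a lysine-producing strain can be sketched as a fold-change filter over matched spot intensities. The following Python example is illustrative only; the spot identifiers, the two-fold cutoff and the function name are assumptions and are not taken from the specification.

```python
def increased_spots(wild_type, producer, min_fold=2.0):
    """Spots whose intensity rose in the producer strain.

    wild_type and producer map a spot id to its gel intensity. Only
    spots present in both strains with a positive wild type intensity
    are compared. Returns (spot id, fold change) pairs sorted by
    decreasing fold change.
    """
    hits = []
    for spot, wt_intensity in wild_type.items():
        mutant_intensity = producer.get(spot)
        if mutant_intensity is None or wt_intensity <= 0:
            continue
        fold = mutant_intensity / wt_intensity
        if fold >= min_fold:
            hits.append((spot, fold))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Each spot returned by such a filter would then be excised, digested and identified by mass spectrometry and database searching as described above.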

When a protein having a high expression level is
identified by proteome analysis using the nucleotide
sequence information and the amino acid sequence
information of the genome of the coryneform bacteria of
the present invention, and a recording medium storing the
sequences, the nucleotide sequence of the gene encoding
this protein and the nucleotide sequence upstream
thereof can be searched at the same time, and thus a
nucleotide sequence having a high expression promoter can
be efficiently selected.
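Retrieving the sequence upstream of a gene from recorded genome data can be illustrated as follows. This is a sketch under stated assumptions: coordinates are 1-based, an ORF whose initial position is larger than its terminal position is taken to lie on the complementary strand, and the circularity of the genome is ignored. The function names and the default length of 300 bases are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, initial, terminal, length=300):
    """Return up to `length` bases immediately upstream of an ORF,
    written 5'->3' on the strand that encodes the ORF.

    `initial` and `terminal` are 1-based positions of the first and
    last base of the ORF; initial > terminal is treated as an ORF on
    the complementary strand (an assumption of this sketch).
    """
    if initial <= terminal:
        # Forward strand: upstream bases have lower coordinates.
        start = max(0, initial - 1 - length)
        return genome[start:initial - 1]
    # Complementary strand: upstream bases have higher coordinates.
    end = min(len(genome), initial + length)
    return reverse_complement(genome[initial:end])


# Toy sequences (hypothetical), each containing one start codon.
print(upstream_region("AAACCCATGGGGTTT", 7, 12, length=3))
print(upstream_region("TTTCATAAACCC", 6, 4, length=3))
```

The returned region is only a candidate for further inspection; deciding whether it contains a promoter is outside the scope of this sketch.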

In the proteome analysis, a spot on the two-dimensional
electrophoresis gel showing a fluctuation is
sometimes derived from a modified protein. However, the
modified protein can be efficiently identified using the
nucleotide sequence information and the amino acid
sequence information of the genome of coryneform
bacteria, and the recording medium storing the sequences,
according to the present invention.

Moreover, a useful mutation point in a useful
mutant can be easily specified by searching a nucleotide
sequence (nucleotide sequence of promoters, ORF, or the
like) relating to the thus identified protein using the
nucleotide sequence information and the amino acid sequence
information of the genome of coryneform bacteria of the
present invention, and a recording medium storing the
sequences, and by using a primer designed on the basis of
the detected nucleotide sequence. Once the useful mutation
point is specified, an industrially useful mutant having
the useful mutation or another useful mutation derived
therefrom can be easily bred.

The present invention will be explained in detail
below based on Examples. However, the present invention is
not limited thereto.

Example 1

Determination of the full nucleotide sequence of genome of
Corynebacterium glutamicum

The full nucleotide sequence of the genome of
Corynebacterium glutamicum was determined based on the
whole genome shotgun method (Science, 269: 496-512 (1995)).
In this method, a genome library was prepared and the
terminal sequences were determined at random.
Subsequently, these sequences were ligated on a computer to
cover the full genome. Specifically, the following
procedure was carried out.

(1) Preparation of genome DNA of Corynebacterium glutamicum
ATCC 13032

Corynebacterium glutamicum ATCC 13032 was cultured
in BY medium (7 g/l meat extract, 10 g/l peptone, 3 g/l
sodium chloride, 5 g/l yeast extract, pH 7.2) containing 1%
of glycine at 30°C overnight and the cells were collected
by centrifugation. After washing with STE buffer (10.3%
sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l EDTA, pH
8.0), the cells were suspended in 10 ml of STE buffer
containing 10 mg/ml lysozyme, followed by gentle shaking at
37°C for 1 hour. Then, 2 ml of 10% SDS was added thereto
to lyse the cells, and the resultant mixture was maintained
at 65°C for 10 minutes and then cooled to room temperature.
Then, 10 ml of Tris-neutralized phenol was added thereto,
followed by gentle shaking at room temperature for 30
minutes and centrifugation (15,000 x g, 20 minutes, 20°C).
The aqueous layer was separated and subjected to extraction
with phenol/chloroform and extraction with chloroform
(twice) in the same manner. To the aqueous layer, 3 mol/l
sodium acetate solution (pH 5.2) and isopropanol were added
at 1/10 times volume and twice volume, respectively,
followed by gentle stirring to precipitate the genome DNA.
The genome DNA was dissolved again in 3 ml of TE buffer (10
mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0)
containing 0.02 mg/ml of RNase and maintained at 37°C for
45 minutes. The extractions with phenol, phenol/chloroform
and chloroform were carried out successively in the same
manner as the above. The genome DNA was subjected to
isopropanol precipitation. The thus formed genome DNA
precipitate was washed with 70% ethanol three times,
followed by air-drying, and dissolved in 1.25 ml of TE
buffer to give a genome DNA solution (concentration: 0.1
mg/ml).

(2) Construction of a shotgun library 

TE buffer was added to 0.01 mg of the thus prepared
genome DNA of Corynebacterium glutamicum ATCC 13032 to give
a total volume of 0.4 ml, and the mixture was treated with
a sonicator (Yamato Powersonic Model 150) at an output of
20 continuously for 5 seconds to obtain fragments of 1 to
10 kb. The genome fragments were blunt-ended using a DNA
blunting kit (manufactured by Takara Shuzo) and then
fractionated by 6% polyacrylamide gel electrophoresis.
Genome fragments of 1 to 2 kb were cut out from the gel,
and 0.3 ml MG elution buffer (0.5 mol/l ammonium acetate,
10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) was
added thereto, followed by shaking at 37°C overnight to
elute DNA. The DNA eluate was treated with
phenol/chloroform, and then precipitated with ethanol to
obtain a genome library insert. The total insert and 500
ng of pUC18 SmaI/BAP (manufactured by Amersham Pharmacia
Biotech) were ligated at 16°C for 40 hours.

The ligation product was precipitated with ethanol
and dissolved in 0.01 ml of TE buffer. The ligation
solution (0.001 ml) was introduced into 0.04 ml of E. coli
ELECTRO MAX DH10B (manufactured by Life Technologies) by
electroporation under conditions according to the
manufacturer's instructions. The mixture was spread on LB
plate medium (LB medium (10 g/l bactotryptone, 5 g/l yeast
extract, 10 g/l sodium chloride, pH 7.0) containing 1.6% of
agar) containing 0.1 mg/ml ampicillin, 0.1 mg/ml X-gal and
1 mmol/l isopropyl-β-D-thiogalactopyranoside (IPTG) and
cultured at 37°C overnight.

The transformant obtained from colonies formed on
the plate medium was stationarily cultured in a 96-well
titer plate having 0.05 ml of LB medium containing 0.1
mg/ml ampicillin at 37°C overnight. Then, 0.05 ml of LB
medium containing 20% glycerol was added thereto, followed
by stirring to obtain a glycerol stock.

(3) Construction of cosmid library 

About 0.1 mg of the genome DNA of Corynebacterium
glutamicum ATCC 13032 was partially digested with Sau3AI
(manufactured by Takara Shuzo) and then ultracentrifuged
(26,000 rpm, 18 hours, 20°C) under 10 to 40% sucrose
density gradient obtained using 10% and 40% sucrose buffers
(1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5 mmol/l EDTA,
10% or 40% sucrose, pH 8.0). After the centrifugation, the
solution thus separated was fractionated into tubes at 1 ml
in each tube. After confirming the DNA fragment length of
each fraction by agarose gel electrophoresis, a fraction
containing a large amount of DNA fragment of about 40 kb
was precipitated with ethanol.

The DNA fragment was ligated to the BamHI site of
superCos1 (manufactured by Stratagene) in accordance with
the manufacturer's instructions. The ligation product was
incorporated into Escherichia coli XL-1-BlueMR strain
(manufactured by Stratagene) using Gigapack III Gold
Packaging Extract (manufactured by Stratagene) in
accordance with the manufacturer's instructions. The
Escherichia coli was spread on LB plate medium containing
0.1 mg/ml ampicillin and cultured thereon at 37°C overnight
to isolate colonies. The resulting colonies were
stationarily cultured at 37°C overnight in a 96-well titer
plate containing 0.05 ml of the LB medium containing 0.1
mg/ml ampicillin in each well. LB medium containing 20%
glycerol (0.05 ml) was added thereto, followed by stirring
to obtain a glycerol stock.

(4) Determination of nucleotide sequence 

(4-1) Preparation of template 

The full nucleotide sequence of Corynebacterium
glutamicum ATCC 13032 was determined mainly based on the
whole genome shotgun method. The template used in the
whole genome shotgun method was prepared by the PCR method
using the library prepared in the above (2).

Specifically, the clone derived from the whole
genome shotgun library was inoculated using a replicator
(manufactured by GENETIX) into each well of a 96-well plate
containing the LB medium containing 0.1 mg/ml of ampicillin
at 0.08 ml per each well and then stationarily cultured at
37°C overnight.

Next, the culture was transferred using
a copy plate (manufactured by Tokken) into a 96-well
reaction plate (manufactured by PE Biosystems) containing a
PCR reaction solution (TaKaRa Ex Taq (manufactured by
Takara Shuzo)) at 0.08 ml per each well. Then, PCR was
carried out in accordance with the protocol by Makino et al.
(DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700
(manufactured by PE Biosystems) to amplify the inserted
fragment.

The excess primers and nucleotides were
eliminated using a kit for purifying a PCR product
(manufactured by Amersham Pharmacia Biotech) and the
residue was used as the template in the sequencing reaction.

Some nucleotide sequences were determined using a
double-stranded DNA plasmid as a template.

The double-stranded DNA plasmid as the template was
obtained by the following method.

The clone derived from the whole genome shotgun
library was inoculated into a 24- or 96-well plate
containing a 2x YT medium (16 g/l bactotryptone, 10 g/l
yeast extract, 5 g/l sodium chloride, pH 7.0) containing
0.05 mg/ml ampicillin at 1.5 ml per each well and then
cultured under shaking at 37°C overnight.

The double-stranded DNA plasmid was prepared from
the culture using an automatic plasmid preparing
machine, KURABO PI-50 (manufactured by Kurabo Industries)
or a multiscreen (manufactured by Millipore) in accordance
with the protocol provided by the manufacturer.

To purify the double-stranded DNA plasmid using the
multiscreen, Biomek 2000 (manufactured by Beckman Coulter)
or the like was employed.

The thus obtained double-stranded DNA plasmid was
dissolved in water to give a concentration of about 0.1
mg/ml and used as the template in sequencing.

(4-2) Sequencing reaction 

To 6 µl of a solution of ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit
(manufactured by PE Biosystems), an M13 regular direction
primer (M13-21) or an M13 reverse direction primer (M13REV)
(DNA Research, 5: 1-9 (1998)) and the template prepared in
the above (4-1) (the PCR product or the plasmid) were added
to give 10 µl of a sequencing reaction solution. The
primers and the templates were used in an amount of 1.6
pmol and an amount of 50 to 200 ng, respectively.

Dye terminator sequencing reaction of 45 cycles was
carried out with GeneAmp PCR System 9700 (manufactured by
PE Biosystems) using the reaction solution. The cycle
parameter was determined in accordance with the
manufacturer's instructions accompanying ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit. The sample
was purified using Multiscreen HV plate (manufactured by
Millipore) according to the manufacturer's instructions. The
thus purified reaction product was precipitated with
ethanol, followed by drying, and then stored in the dark at
-30°C.



The dry reaction product was analyzed by ABI PRISM
377 DNA Sequencer and ABI PRISM 3700 DNA Analyzer (both
manufactured by PE Biosystems) each in accordance with the
manufacturer's instructions.

The data of about 50,000 sequences in total (i.e.,
about 42,000 sequences obtained using 377 DNA Sequencer and
about 8,000 sequences obtained using 3700 DNA Analyzer) were
transferred to a server (Alpha Server 4100: manufactured by
COMPAQ) and stored. These approximately 50,000
sequences corresponded to 6 times the genome
size.
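The relation between the number of sequences and the genome size can be checked with simple arithmetic. The sketch below is illustrative only: the mean read length of 400 bases and the genome size of 3,300,000 bases are assumptions chosen for the example and are not stated in this section. The estimate of the fraction of the genome left uncovered uses the Lander-Waterman approximation exp(-c), which assumes that reads are placed at random.

```python
import math


def fold_coverage(n_reads, mean_read_len, genome_size):
    """Total sequenced bases divided by the genome size."""
    return n_reads * mean_read_len / genome_size


def expected_uncovered_fraction(coverage):
    """Lander-Waterman approximation of the fraction of bases that no
    read covers when reads are placed at random."""
    return math.exp(-coverage)


# Assumed values for illustration only (not from the specification).
N_READS = 50_000
MEAN_READ_LEN = 400        # bases, hypothetical
GENOME_SIZE = 3_300_000    # bases, hypothetical

c = fold_coverage(N_READS, MEAN_READ_LEN, GENOME_SIZE)
print(f"fold coverage: {c:.2f}")
print(f"expected uncovered fraction: {expected_uncovered_fraction(c):.4f}")
```

Even at about 6-fold coverage a small fraction of the genome is expected to remain unsequenced, which is consistent with the need for the gap determination described in (6) below.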

(5) Assembly 

All operations were carried out on a
UNIX platform. The analytical data were output on a
Macintosh platform using the X Window System. The base call
was carried out using phred (The University of Washington).
The vector sequence data was deleted using SPS Cross_Match
(manufactured by Southwest Parallel Software). The
assembly was carried out using SPS phrap (manufactured by
Southwest Parallel Software; a high-speed version of phrap
(The University of Washington)). The contig obtained by
the assembly was analyzed using a graphical editor, consed
(The University of Washington). A series of the operations
from the base call to the assembly were carried out
together using a script phredPhrap attached to consed.

(6) Determination of nucleotide sequence in gap part

Each cosmid in the cosmid library constructed in
the above (3) was prepared by a method similar to the
preparation of the double-stranded DNA plasmid described in
the above (4-1). The nucleotide sequence at the end of the
inserted fragment of the cosmid was determined by using ABI
PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit
(manufactured by PE Biosystems) according to the
manufacturer's instructions.

About 800 cosmid clones were sequenced at both ends,
and the contigs derived from the shotgun sequencing
obtained in the above (5) were searched for nucleotide
sequences coincident with these end sequences. Thus, the
linkages between respective cosmid clones and respective
contigs were determined and mutual alignment was carried
out. Furthermore, the results were compared with the
physical map of Corynebacterium glutamicum ATCC 13032
(Mol. Gen. Genet., 252: 255-265 (1996)) to carry out
mapping between the cosmids and the contigs.

The sequence in the region which was not covered 
with the contigs was determined by the following method. 

Clones containing sequences positioned at the ends
of contigs were selected. Among these clones, about 1,000
clones wherein only one end of the inserted fragment had
been determined were selected and the sequence at the
opposite end of the inserted fragment was determined. A
shotgun library clone or a cosmid clone containing the
sequences at the respective ends of the inserted fragment
in two contigs was identified, the full nucleotide sequence
of the inserted fragment of this clone was determined, and
thus the nucleotide sequence of the gap part was determined.
When no shotgun library clone or cosmid clone covering the
gap part was available, primers complementary to the end
sequences of the two contigs were prepared and the DNA
fragment in the gap part was amplified by PCR. Then,
sequencing was performed by the primer walking method using
the amplified DNA fragment as a template or by the shotgun
method in which the sequence of a shotgun clone prepared
from the amplified DNA fragment was determined. Thus, the
nucleotide sequence of the gap region was determined.
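The selection of clones that span a gap between two contigs can be sketched as follows. This is a simplified illustration, assuming that each end sequence of a clone has already been assigned to a contig (or to none) by a sequence search. The function name bridging_clones and all clone and contig names are hypothetical.

```python
def bridging_clones(end_hits):
    """Group clones by the pair of different contigs that their two
    insert ends fall in.

    `end_hits` maps a clone name to a pair (contig of one end, contig
    of the other end); None means that the end could not be placed.
    """
    links = {}
    for clone, (first_end, second_end) in end_hits.items():
        if first_end is None or second_end is None:
            continue  # one end is unplaced, so no link can be inferred
        if first_end == second_end:
            continue  # both ends lie in the same contig: no gap is spanned
        pair = tuple(sorted((first_end, second_end)))
        links.setdefault(pair, []).append(clone)
    return links


# Hypothetical end-sequence placements.
end_hits = {
    "cosmid_01": ("contig_3", "contig_7"),
    "cosmid_02": ("contig_7", "contig_3"),
    "cosmid_03": ("contig_2", "contig_2"),
    "shotgun_A": ("contig_5", None),
}

for pair, clones in bridging_clones(end_hits).items():
    print(pair, clones)
```

Any pair of contigs without a bridging clone would, in the procedure described above, be handled by PCR amplification of the gap part.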

In a region showing a low sequence precision,
primers were synthesized using AUTOFINISH function and
NAVIGATING function of consed (The University of
Washington) and the sequence was determined by the primer
walking method to improve the sequence precision. The thus
determined full nucleotide sequence of the genome of
Corynebacterium glutamicum ATCC 13032 strain is shown in
SEQ ID NO:1.



(7) Identification of ORF and presumption of its function 

ORFs in the nucleotide sequence represented by SEQ
ID NO:1 were identified according to the following method.
First, the ORF regions were determined using software for
identifying ORF, i.e., Glimmer, GeneMark and GeneMark.hmm
on UNIX platform according to the respective manual
attached to the software.

Based on the data thus obtained, ORFs in the
nucleotide sequence represented by SEQ ID NO:1 were
identified.
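For orientation, the basic idea of locating ORFs on both strands of a sequence can be sketched as below. This is a deliberately naive scan for a start codon followed by an in-frame stop codon. It is not the method used by Glimmer, GeneMark or GeneMark.hmm, which rely on statistical models of coding sequence. The function name find_orfs, the minimum length and the coordinate convention (1-based, stop codon included, initial position larger than terminal position on the complementary strand) are assumptions of this sketch.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_orfs(seq, min_codons=30, starts=("ATG",)):
    """Return (initial, terminal, strand) tuples for simple ORFs.

    Coordinates are 1-based positions on `seq` and include the stop
    codon. For ORFs on the complementary strand, initial > terminal.
    """
    n = len(seq)
    strands = (("+", seq), ("-", seq.translate(COMPLEMENT)[::-1]))
    results = []
    for strand, s in strands:
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in starts:
                    start = i  # first start codon after the last stop
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        first, last = start + 1, i + 3
                        if strand == "-":
                            # Map back to coordinates on `seq`.
                            first, last = n - first + 1, n - last + 1
                        results.append((first, last, strand))
                    start = None
    return results


# Toy sequence (hypothetical): one short ORF on the forward strand.
toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "GG"
print(find_orfs(toy, min_codons=3))
```

The `starts` parameter is included so that initiation codons other than ATG can be supplied by the caller.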

The putative function of an ORF was determined by
searching, using Frame Search (manufactured by Compugen)
or BLAST, the homology of the identified amino acid
sequence of the ORF against an amino acid database
consisting of protein-encoding domains derived from
Swiss-Prot, PIR or the Genpept database constituted by
protein-encoding domains derived from the GenBank
database. The nucleotide sequences of
the thus determined ORFs are shown in SEQ ID NOS:2 to 3501,
and the amino acid sequences encoded by these ORFs are
shown in SEQ ID NOS:3502 to 7001.

In some cases of the sequence listings in the
present invention, nucleotide sequences, such as TTG, TGT,
GGT, and the like, other than ATG, are read as an
initiating codon encoding Met.

Also, the preferred nucleotide sequences are SEQ ID
NOS:2 to 355 and 357 to 3501, and the preferred amino acid
sequences are shown in SEQ ID NOS:3502 to 3855 and 3857 to
7001.
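The numbering of the sequence listing suggests a fixed offset between an ORF and its encoded polypeptide. The sketch below assumes, as an inference from the ranges given above (SEQ ID NOS:2 to 3501 for the ORFs, 3502 to 7001 for the amino acid sequences, with 356 and 3856 excluded together from the preferred sets), that the amino acid sequence of the ORF of SEQ ID NO:n is SEQ ID NO:n+3500. This one-to-one correspondence is not stated explicitly in this section, and the function name is hypothetical.

```python
ORF_FIRST, ORF_LAST = 2, 3501   # nucleotide SEQ ID NOS of the ORFs
OFFSET = 3500                   # inferred offset to the amino acid SEQ ID NO


def protein_seq_id(orf_seq_id):
    """Return the amino acid SEQ ID NO assumed to correspond to an ORF."""
    if not ORF_FIRST <= orf_seq_id <= ORF_LAST:
        raise ValueError(f"SEQ ID NO:{orf_seq_id} is not an ORF sequence")
    return orf_seq_id + OFFSET


for orf in (2, 356, 3501):
    print(f"SEQ ID NO:{orf} -> SEQ ID NO:{protein_seq_id(orf)}")
```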

Table 1 shows the registration numbers in the
above-described databases of sequences which were judged as
having the highest homology with the nucleotide sequences
of the ORFs as the results of the homology search in the
amino acid sequences using the homology-searching software
Frame Search (manufactured by Compugen), names of the genes
of these sequences, the functions of the genes, and the
matched length, identities and analogies compared with
publicly known amino acid translation sequences. Moreover,
the corresponding positions were confirmed via the
alignment of the nucleotide sequence of an arbitrary ORF
with the nucleotide sequence of SEQ ID NO:1. Also, the
positions of nucleotide sequences other than the ORFs (for
example, ribosomal RNA genes, transfer RNA genes, IS
sequences, and the like) on the genome were determined.

Fig. 1 shows the positions of typical genes of the
Corynebacterium glutamicum ATCC 13032 on the genome.
Table 1

db Match | Homologous gene | Function
gsp:R98523 | Brevibacterium flavum dnaA | replication initiation protein DnaA
sp:DP3B_MYCSM | Mycobacterium smegmatis dnaN | DNA polymerase III beta chain
sp:RECF_MYCSM | Mycobacterium smegmatis recF | DNA replication protein (recF protein)
sp:YREG_STRCO | Streptomyces coelicolor yreG | hypothetical protein
(not legible) | Mycobacterium tuberculosis H37Rv gyrB | DNA topoisomerase (ATP-hydrolyzing)
(not legible) | Mycobacterium tuberculosis H37Rv | NAGC/XYLR repressor
(not legible) | Mycobacterium tuberculosis H37Rv Rv0006 gyrA | DNA gyrase subunit A
(not legible) | Mycobacterium tuberculosis H37Rv Rv0007 | hypothetical membrane protein
(not legible) | Escherichia coli K12 yeiH | hypothetical protein
(not legible) | Hydrogenophilus thermoluteolus TH-1 cbbR | bacterial regulatory protein, LysR
(not legible) | Rhodobacter capsulatus ccdA | cytochrome c biogenesis protein
(not legible) | Coxiella burnetii com1 | hypothetical protein
(not legible) | Mycobacterium tuberculosis H37Rv Rv1846c | repressor

Rows are listed in the order of the source; rows without a
database match are omitted. The remaining columns of this
part of the table are not legible. Legible fragments that
cannot be assigned to a row: percentage values 29.5, 33.7,
27.6, 29.1, 31.6 and 36.8; db Match sp:YV11_MYCTU;
nucleotide positions 14746, 15209, 17670, 7830 and 11831;
SEQ NO. (a.a.) 3510.

Table 1 (continued)

db Match | Homologous gene | Function
gp:MLCB1788_6 | Mycobacterium leprae MLCB1788.18 | hypothetical membrane protein
pir:I40838 | Corynebacterium sp. ATCC 31090 | 2,5-diketo-D-gluconic acid reductase
sp:5NTD_VIBPA | Vibrio parahaemolyticus nutA | 5'-nucleotidase precursor
gp:AE001909_7 | Deinococcus radiodurans DR0505 | 5'-nucleotidase family protein
prf:2513302C | Corynebacterium striatum ORF1 | transposase
prf:2413353A | Xanthomonas campestris phaseoli ohr | organic hydroperoxide detoxication enzyme
sp:RECG_THIFE | Thiobacillus ferrooxidans recG | ATP-dependent DNA helicase
sp:AMYH_YEAST | Saccharomyces cerevisiae S288C YIR019C sta1 | glucan 1,4-alpha-glucosidase
(not legible) | Erysipelothrix rhusiopathiae ewlA | lipoprotein
(not legible) | Streptococcus pyogenes SF370 mtsC | ABC 3 transport family or integral membrane protein
(not legible) | Escherichia coli K12 fecE | iron(III) dicitrate transport ATP-binding protein
(not legible) | Thermotoga maritima MSB8 TM0114 | sugar ABC transporter, periplasmic sugar-binding protein
(not legible) | Escherichia coli K12 rbsC | high affinity ribose transport protein
(not legible) | Bacillus subtilis 168 rbsA | ribose transport ATP-binding protein
(not legible) | Petromyzon marinus | neurofilament subunit NF-180
(not legible) | Mycobacterium leprae H37RV RV0009 ppiA | peptidyl-prolyl cis-trans isomerase A
(not legible) | Bacillus subtilis 168 yqgP | hypothetical membrane protein

Rows are listed in the order of the source; rows without a
database match are omitted. The remaining columns of this
part of the table are not legible. Legible percentage
values that cannot be assigned to a row: 52.9, 28.9, 39.2
and 23.6.

Table 1 (continued)

SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp) | db Match | Homologous gene | Function
3869 | 346460 | 346110 | (not legible) | | |
3870 | 348019 | 346961 | 1059 | sp:ADH_MYCTU | Mycobacterium tuberculosis H37Rv adhC | NADP-dependent alcohol dehydrogenase
3871 | 348952 | 348098 | (not legible) | sp:RFBA_SALAN | Salmonella anatum M32 rfbA | glucose-1-phosphate thymidylyltransferase
3872 | 350310 | 348952 | 1359 | (not legible) | Streptococcus mutans rmlC | dTDP-4-keto-L-rhamnose reductase
3873 | 351443 | 350313 | 1131 | sp:RMLB_STRMU | Streptococcus mutans XC rmlB | dTDP-glucose 4,6-dehydratase
3874 | 351948 | 351370 | (not legible) | sp:NOX_THETH | Thermus aquaticus HB8 nox | NADH dehydrogenase
3875 | 352693 | 353637 | (not legible) | prf:2510361A | Staphylococcus aureus sirA | Fe-regulated protein
3876 | 354387 | 353749 | (not legible) | | |
3877 | 355906 | 354599 | 1308 | (not legible) | Mycobacterium tuberculosis H37Rv Rv3630 | hypothetical membrane protein
3878 | 357228 | 355849 | 1380 | gp:SC5F2A_19 | Streptomyces coelicolor SC5F2A.19C | metallopeptidase
3879 | 359354 | 357237 | 2118 | prf:2502226A | Sphingomonas capsulata | prolyl endopeptidase
3880 | 360334 | 359762 | (not legible) | | |
3881 | 361905 | 360814 | 1092 | (not legible) | Streptomyces coelicolor A3(2) | hypothetical membrane protein
3882 | 363151 | 362057 | 1095 | gsp:W56155 | Corynebacterium ammoniagenes ATCC 6872 | cell surface layer protein
3883 | 363824 | 365257 | 1434 | prf:2404346B | Acinetobacter johnsonii ptk | autophosphorylating protein Tyr kinase
3884 | 365250 | 365852 | (not legible) | prf:2404346A | Acinetobacter johnsonii ptp | protein phosphatase
3885 | 365855 | 366838 | (not legible) | | |
3886 | 366832 | 368643 | 1812 | sp:CAPD_STAAU | Staphylococcus aureus M capD | capsular polysaccharide biosynthesis
3887 | 368642 | 367701 | (not legible) | PRF:2109288X | Vibrio cholerae | ORF 3
3888 | 368647 | 369801 | 1155 | prf:2423410L | Campylobacter jejuni wlaK | lipopolysaccharide biosynthesis / aminotransferase

The SEQ NO. (DNA), Identity (%), Similarity (%) and Matched
length (aa) columns of this part of the table are not
legible in the source.

(Table 1, continued: this part of the table is not legible
in the source.)

Table 1 (continued)

SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp) | db Match | Homologous gene | Identity (%) | Similarity (%) | Function
3908 | 387692 | 389098 | 1407 | gp:CGLPD_1 | Corynebacterium glutamicum ATCC 13032 lpd | 99.6 | 100.0 | dihydrolipoamide dehydrogenase
3909 | 389248 | 390168 | (not legible) | pir:JC4985 | Xanthomonas campestris | 41.7 | 68.1 | UTP-glucose-1-phosphate uridylyltransferase
3910 | 390233 | 390730 | (not legible) | gp:PAU49666_2 | Pseudomonas aeruginosa PAO1 orfX | 43.8 | 71.9 | regulatory protein
3911 | 392208 | 390787 | 1422 | pir:E70828 | Mycobacterium tuberculosis H37Rv Rv0465c | 57.0 | 81.3 | transcriptional regulator
3912 | 392705 | 393475 | (not legible) | gp:SCM10_12 | Streptomyces coelicolor A3(2) SCM10.12c | 34.8 | 67.4 | cytochrome b subunit
3913 | 393639 | 395513 | 1875 | pir:A27763 | Bacillus subtilis sdhA | 32.4 | 61.2 | succinate dehydrogenase flavoprotein
3914 | 395426 | 396262 | (not legible) | gp:BMSDHCAB_4 | Paenibacillus macerans sdhB | 27.5 | 56.2 | succinate dehydrogenase subunit B
3915 | 396315 | 396650 | (not legible) | | | | |
3916 | 396672 | 396932 | (not legible) | | | | |
3917 | 397040 | 396411 | (not legible) | | | | |
3918 | 397730 | 397825 | (not legible) | | | | |
3919 | 397884 | 398222 | (not legible) | | | | |
3920 | 398206 | 397232 | (not legible) | gp:SCC78_5 | Streptomyces coelicolor SCC78.05 | 26.3 | 49.8 | hypothetical protein
3921 | 398329 | 399579 | 1251 | sp:YJIN_ECOLI | Escherichia coli K12 yjiN | 32.7 | 64.3 | hypothetical protein
3922 | 399598 | 400017 | (not legible) | | | | |
3923 | 400039 | 400341 | (not legible) | | | | |
3924 | 400473 | 401150 | (not legible) | sp:TCMR_STRGA | Streptomyces glaucescens GLAO tcmR | 26.4 | 53.8 | tetracenomycin C transcription repressor
3925 | 401050 | 401253 | (not legible) | | | | |
3926 | 401150 | 402796 | 1647 | (not legible) | Streptomyces fradiae T#2717 urdJ | 36.1 | 74.6 | transporter

The SEQ NO. (DNA) and Matched length (aa) columns of this
part of the table are not legible in the source.

-150 



o 
o 



C CO 

CO <a -— - 



(0 
CO 



ill 



CD 



O 
O 



,co 



0) 
CO 
€/) 
ZJ 

o 
cn 
o 
o 

E 
o 
X 



o 



GO 

o 
lo 



CD 
CO 



E 
o 

§-=> 



T3 



E c 

- — - 

CD 



CO 



2§ 2 

CO Z ^ 



O r :< 
uu § z 

co ^ Q 



CD 
CD 



CL 



CN 
CO 
CD 



o 

CO 
T 

o 



CD 
CD 

CN 

o 



CN 
CD 
CO 



CN 



c 
o 

CL 



o 



o 

CO 
CN 



CD 
CO 
LO 



CO 
CN 



CD 
O 

CO CO 

-° 3: 

o CO 

o > 
>> CO 

S E 



co 

T — 

T 

CN 

±1 
CL 



CD 
CO 



o 



o 

LO 
LO 
CO 
O 



CN 
CO 
CO 
CO 



CN 
CO 



2 

CL 



o 

CL 



CN 
CO 



LO 
CO 



oo 

LO 



E ° 

£ > 

o 0C 
o r»- 
>^co 
S X 



o 

CO 

o 
< 



o 
o 

CO 



CO 

o 



CO 

o 
o 



CO 
CO 
CO 
CO 



CD 



LLI 



o 

o 

_=3 
CO 

I 

CO 
JZ 
_CL 

(0 



CD 
CN 
CD 



CO 
LO 



CN 



CD 
CO 
'</) 

> _ 

£ To 
8 - 

c/, O 

0) CO 

O x- 

^ o 

E or 

2 > 
5o 

o CO 

O CO 

CO CN 

CO CO 



CO 

< 



X 



< 

CL 



CO 
CD 
CO 



LO 
CN 



CO 
CO 
CO 

o 



I s -. 
co 

CO 
CO 



CO 
^3- 



c 

'<D 

"S 
k_ 
CL 

O 
E 

to 

JO 

Q_ 

CD 
CL 
CO 

c 



£ 
cu 



CO 
CO 



CO 
CO 



lo 



CL 

E 



CO 
XI 
CD 

ol 



CN 
CO 

CO 

o 

LL 
< 
CL 
CO 



o 



CO 
CO 
CO 
CO 



LO 
LO 
CN 



CO 

CO 
CO 
CO 



CO 
CO 





c 




CD 




o 




CL 




CO 




c 












— 
X) 




OL 








< 






CD 


CD 






o 


o 


CL 


CL 


Ol 


C/> 




c: 


CO 


CO 


I— 

-4—' 


' 


O 


O 


CD 


CD 


< 


< 



o 

CO 
CO 



o 

CO 



LO 

CO 



"o 
ro 

JQ 
CD 

SI 



CN 
CD 



CO 
O 



CL 
CO 



CO 
CO 
O 



CD 
CO 



CO 
CO 
CO 
CO 



CO 
CO 



LO 

CN 



LO 
CO 



CO 
CD 



XJ 



O 
CO 
-Q 
CD 

31 



CN 
CD 



CO 
O 



CL 
CO 



CO 
CO 



CD 
CN 
LO 
LO 



o 
co 

CO 



o 



c 

CD 

"o 

CL 

75 
o 

oS 
£j 
o 

CL 



CD 
CO 
CN 



CD 
LO 



CO 
CO 
CN 



< 
LO 

O 

o 

CO 



I s *- 

LO 
CO 



CO 
CO 
LO 

CO 



CO 

CO 
LO 



co 

CO 



CD 

o 



O 
CL 
>s 



CO 
LO 
CN 



CO 
CD 



CO 

CN 
CO 



Streptomyces coelicolor
SCC75A.17c


Streptomyces coelicolor
SCC75A.17c



Function


UDP-N-acetylpyruvoylglucosamine 
reductase 








long-chain-fatty-acid-CoA ligase 


transferase 


phosphoglycerate mutase 


two-component system sensor 
histidine kinase 


two-component response regulator 




ABC transporter ATP-binding protein 


cytochrome P450 


exopolyphosphatase 


hypothetical membrane protein 


pyrroline-5-carboxylate reductase 


membrane glycoprotein 


hypothetical protein 




Matched 
length 
(a.a) 


CO 

LO 
CO 








CO 
LO 
LO 


CO 


CO 
CN 




CO 
CN 




CN 
CO 


CO 
CD 
CN 


CD 

o 

CO 


OJ 

o 

CO 


CO 
CD 

oi 


CO 
CO 


LO 

to 




Similarity
(%) 


58.4 








68.1 


58.7 


84.2 


CO 


CO 

o 

CO 




60.7 


66.9 


57.8 


57.3 


100.0 


52.0 


94.6 




to 






































Identity 
(%) 


30.1 








LO 

LO 
CO 


33.9 


70.7


49.2 


75.8 




31.3 


45.0 


28.8 


28.8 


100.0 


25.4 


76.4 




Homologous gene 


Escherichia coli RDD012 murB 








Bacillus subtilis lcfA


Streptomyces coelicolor 
SC2G5.06 


Streptomyces coelicolor A3(2) 
gpm 


Mycobacterium bovis senX3 


Mycobacterium bovis BCG 
regX3 




Streptomyces coelicolor A3(2) 
SCE25.30 


Mycobacterium tuberculosis 
H37Rv Rv3121


Pseudomonas aeruginosa ppx 


Mycobacterium tuberculosis 
H37Rv Rv0497 


Corynebacterium glutamicum 
ATCC 17965 proC 


Equine herpesvirus 1 ORF71 


Mycobacterium leprae 
B2168_C1_172




db Match 


gp:ECOMURBA_1 








sp:LCFA_BACSU 


gp:SC2G5_6


sp:PMGY_STRCO


prf:2404434A 


prf:2404434B




gp:SCE25_30 


1— 
O 
> 

CN 
> 
> 
Cl 
co 


prf:2512277A


sp:YV23_MYCTU 


sp:PROC_CORGL 


gp:D88733_1


pir:S72921 






ORF
(bp)


1101


651


735


174


1704


1254


744


1239


696


879


2586


903


927


813


810


1122


198


219


Terminal 
(nt) 


420885 


421516 


420309 


422031 


422090 


425131 


425920


427172 


427867 


429439 


429438 


432126 


433988 


434822 


435695 


433865 


436137 


436103 


Initial 
(nt) 


419785 


420866 


421043 


421858 


423793 


423878 


425177 


425934 


427172 


428561 


432023 


433028 


433062 


434010 


434886 


434986 


435940 


436321 


SEQ 
NO. 
(a.a.) 


3946 


3947 


3948 


3949 


3950 


3951 


3952 


3953 


3954 


3955 


3956 


3957 


3958 


3959 


3960 


3961 


3962 


3963 


SEQ 
NO. 
(DNA) 


446


447


448


449


450


451


452


453


454


455


456


457


458


459


460


461


462


463
Function


hypothetical protein 






phosphoserine phosphatase 


hypothetical protein 




glutamyl-tRNA reductase 


hydroxymethylbilane synthase 




cat operon transcriptional regulator 


shikimate transport protein 


3-dehydroshikimate dehydratase 


shikimate dehydrogenase 




putrescine transport protein 




iron(III)-transport system permease
protein 




periplasmic-iron-binding protein 


uroporphyrin-III C-methyltransferase




Matched 
length 
(aa) 


CJ) 
CN 






CO 
CJ) 
CN 


t^- 




LO 
LO 


oo 
o 

CO 




CN 
CO 




CO 

o 

CO 


CN 
OO 
CN 




CO 
CO 
CO 




CO 

LO 




co 


CD 
OO 




Similarity
(%) 


100.0 






77.4 


66.2 




74.3 


75.3 




57.6 


72.2 


57.9 


98.6 




68.6 




55.2 




59.9 


71.6 




CO 












































Identity 
(%) 


89.7 






51.0 


40.5 




44.4 


50.7 




27.1 


35.5 


28.2 


98.2 




CO 




25.1 




25.1 


46.5 




Homologous gene 


Streptomyces coelicolor 
SCE68.25c 






Mycobacterium leprae 
MTCY20G9.32C. serB 


Mycobacterium tuberculosis 
H37Rv Rv0508 




Mycobacterium leprae hemA 


Mycobacterium leprae hem3b 




Acinetobacter calcoaceticus 
catM 


Escherichia coli K12 shiA 


Neurospora crassa qa4 


Corynebacterium glutamicum 
AS019 aroE




Escherichia coli K12 potG 




Serratia marcescens sfuB 
Function


hypothetical membrane protein 




transcriptional repressor 


hypothetical protein 




transcriptional regulator (Sir2 family) 


hypothetical protein 


iron-regulated lipoprotein precursor 


rRNA methylase 


methylenetetrahydrofolate 
dehydrogenase 


hypothetical membrane protein 


hypothetical protein 




homoserine O-acetyltransferase 


O-acetylhomoserine sulfhydrylase 


carbon starvation protein 




hypothetical protein 
Homologous gene


Streptomyces coelicolor A3(2) 
SCE9.01 




Mycobacterium tuberculosis 
H37Rv Rv2788 sirR 


Streptomyces coelicolor A3(2) 
SCG8A.05C 




Archaeoglobus fulgidus AF1676 


Streptomyces coelicolor A3(2) 
SC5H1.34 


Corynebacterium diphtheriae 
irp1 


Mycobacterium tuberculosis 
H37Rv Rv3366 spoU 


Mycobacterium tuberculosis 
H37Rv Rv3356c folD 


Mycobacterium leprae 
MLCB1779.16C 


Streptomyces coelicolor A3(2) 
SC66T3.18c 




Corynebacterium glutamicum 
metA 


Leptospira meyeri metY 


Escherichia coli K12 cstA 




Escherichia coli K12 yjiX 
Function




ferrichrome ABC transporter 


hemin permease 


tryptophanyl-tRNA synthetase 


hypothetical protein 




penicillin-binding protein 6B 
precursor 


hypothetical protein 


hypothetical protein 






uracil phosphoribosyltransferase 


bacterial regulatory protein, lacI
family


N-acyl-L-amino acid amidohydrolase 
or peptidase 


phosphomannomutase 


dihydrolipoamide dehydrogenase 


pyruvate carboxylase 


hypothetical protein 


hypothetical protein 
Yersinia enterocolitica hemU


Escherichia coli K12 trpS
dacD


Mycobacterium tuberculosis 
H37Rv Rv3311 


Streptomyces coelicolor A3(2) 
SC6G10.08c






Lactococcus lactis upp


Streptomyces coelicolor A3(2) 
SC1A2.11 


Mycobacterium tuberculosis 
H37Rv Rv3305c amiA 


Mycoplasma pirum BER manB 


Halobacterium volcanii ATCC 
29605 lpd


Corynebacterium glutamicum 
strain 21253 pyc


Mycobacterium tuberculosis 
H37Rv Rv1324 


Streptomyces coelicolor A3(2) 
SCF11.30 
699998


702081 


702108 


703405 


705211 


708839 


709793 


SEQ 

NO. 
(a.a.) 


4249 


4250 


4251 


4252 


4253 


4254 


4255 


4256 


4257 


4258 


4259 


4260 


4261 


4262 


4263 


4264 


4265 


4266 


4267 


SEQ 
NO. 
(DNA) 


749


750


751


752


753


754


755


756


757


758


759


760


761


762


763


764


765


766


767






Function 


bifunctional protein (biotin synthesis 
repressor and biotin acetyl-CoA 
carboxylase ligase) 


hypothetical membrane protein 


5'-phosphoribosyl-5-amino-4-
imidazole carboxylase


K+-uptake protein 






5'-phosphoribosyl-5-amino-4-
imidazole carboxylase


hypothetical protein 
hypothetical membrane protein
Function


hypothetical protein 


dTDP-Rha:a-D-GlcNAc-
diphosphoryl polyprenol, a-3-L-
rhamnosyl transferase


mannose-1-phosphate
guanylyltransferase 


regulatory protein 


hypothetical protein 


hypothetical protein 


phosphomannomutase 


hypothetical protein 


mannose-6-phosphate isomerase 






pheromone-responsive protein 




S-adenosyl-L-homocysteine 
hydrolase 






thymidylate kinase 
Mycobacterium tuberculosis
Mycobacterium smegmatis
mc2155 wbbL
YDL055C MPG1


Mycobacterium smegmatis 
whmD 


Mycobacterium tuberculosis 
Streptomyces coelicolor A3(2)
SCE34.11C 


Salmonella montevideo M40 
manB 


Mycobacterium tuberculosis 
H37Rv Rv3256c 


Escherichia coli K12 manA 
pCF10 prgC




Trichomonas vaginalis WAA38 






Archaeoglobus fulgidus VC-16 











Function


regulatory protein


hypothetical protein


hypothetical protein


DEAD box ATP-dependent RNA
helicase




hypothetical protein


hypothetical protein


ATP-dependent DNA helicase




ATP-dependent DNA helicase




potassium channel 


hypothetical protein 


DNA helicase II 




hypothetical protein 




Matched 
length 
(aa) 


oo 


CO 
CN 


to 


CO 

to 




CD 
CN 


CO 
CN 


1155 




1126 




CN 
O 

CO 


o 

CO 
CN 


O 
CO 
CO 




o 

CO 
CN 










































96.4 


65.1 


62.2 


64.0 




69.8 


65.9 


48.9 




65.7 




64.2 


58.3 


58.8 




49.3 




CO 




































Identity 
(%) 


78.6 


33.3 


29.6


37.3 




46.4 


37.0 


23.9 




41.4




26.2 


30.4


32.6 




26.8 




Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3219 whiB1


Mycobacterium tuberculosis
H37Rv Rv3217c


Mycobacterium tuberculosis
H37Rv Rv3212


Klebsiella pneumoniae CG43 
deaD 




Mycobacterium tuberculosis 
H37Rv Rv3207c 


Mycobacterium tuberculosis 
H37Rv Rv3205c 


Mycobacterium tuberculosis 
H37Rv Rv3201c 




Mycobacterium tuberculosis 
H37Rv Rv3201c 




Methanococcus jannaschii JAL-1
MJ0138.1


Mycobacterium tuberculosis 
H37Rv Rv3199c 


Escherichia coli K12 uvrD 




Mycobacterium tuberculosis 
H37Rv Rv3196




db Match 


pir:D70596 


pir:B70596


pir:E70595 


sp:DEAD_KLEPN 




pir:H70594 


pir:F70594


pir:G70951 




pir:G70951 




sp:Y13B_METJA 


pir:E70951 


sp:UVRD_ECOLI 




pir:B70951 




ORF
(bp)


258


420


1200


1272


225


846


759


3048


780


3219


1332


1005


714


2034


591


816


603


Terminal 
(nt) 


805535 


806737 


806740 


807946 


809510 


810394 


811163 


814217 


811386 


817422 


814210 


818523 


819236 


821287 


822669 


821290 


823391 


Initial 
(nt) 


805792 


806318 


807939 


809217 


809286 


809549 


810405 


811170 


812165 


814204 


815541 


817519 


818523 


819254 


822079 


822105 


822789 


SEQ 
NO. 
(a.a.) 


4356 


4357 


4358 


4359 


4360 


4361 


4362 


4363 


4364 


4365 


4366 


4367 


4368 


4369 


4370 


4371 


4372 


SEQ 
NO. 
(DNA) 


856


857


858


859


860


861


862


863


864


865


866


867


868


869


870


871


872















Function 


hypothetical protein 


hypothetical protein 






hypothetical protein 


regulatory protein 


ethylene-inducible protein 


hypothetical protein 


hypothetical protein 




alpha-lytic proteinase precursor 




DNA-directed DNA polymerase 


major secreted protein PS1 protein
precursor










monophosphatase 


Matched
length




o 

LO 
CO 






1023 


CO 
CO 


5 

CO 


CO 


o 

CN 




CO 

o 
^* 




CO 

o 

CN 


CO 
CO 
CO 










LO 

in 

CN 


















































































ro o 




CO 






LO 




o 


o 


CO 










to 










CO 


CD 

i^- 


r-- 






CO 

r- 


in 


CO 
CO 


CO 
LO 


CO 

r^- 








LO 


LO 










"<T 

r^- 


CO 








































-♦— > 


co 








CN 


CO 




o 


CO 








O 


o 










CO 


I* 


CN 


CO 








CO 


CO 


CO 


o 




cb 

CN 




LO 
CN 


CN 










to 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3195


Mycobacterium tuberculosis 
H37Rv Rv3194 






Mycobacterium tuberculosis 
H37Rv Rv3193c 


Deinococcus radiodurans 
DR0840 


Hevea brasiliensis laticifer er1 


Aeropyrum pernix K1 APE0247 


Bacillus subtilis 168 yaaE 




Lysobacter enzymogenes ATCC 
29487 




Neurospora intermedia LaBelle- 
1b mitochondrion plasmid 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1










Streptomyces alboniger pur3 


db Match 


pir:A70951 


pir:H70950 






pir:G70950 


gp:AE001938_5 


sp:ER1_HEVBR 


PIR:F72782 


sp:YAAE_BACSU 




pir:TRYXB4 




pir:S03722 


sp:CSP1_CORGL










prf:2207273H 




ORF
(bp)


1446


1050


675


522


2955


1359


951


345


600


363


1062


501


585


1581


429


510


222


309


780


Terminal 
(nt) 


822680 


825239 


825242 


825996 


829570 


829627 


831971 


831578 


832570 


832795 


834633 


835388 


835837 


838892 


839353 


840139 


840210 


840437 


841517 


Initial 
(nt) 


824125 


824190 


825916 


826517 


826616 


830985 


831021 


831922 


831971 


833157 


833572 


834888 


835253 


837312 


838925 


839630 


840431 


840745 


842296 


SEQ 

NO. 
(a.a.) 


4373 


4374 


4375 


4376 


4377 


4378 


4379 


4380 


4381 


4382 


4383 


4384 


4385 


4386 


4387 


4388 


4389 


4390 


4391 


SEQ 
NO. 
(DNA) 


873


874


875


876


877


878


879


880


881


882


883


884


885


886


887


888


889


890


891






Function 


hypothetical protein 


hypothetical protein 


kynurenine 

aminotransferase/glutamine 
transaminase K 




DNA repair helicase 


hypothetical protein 


hypothetical protein 




resuscitation-promoting factor 


cold shock protein 


hypothetical protein 


glutamine cyclotransferase 






permease 




rRNA(adenosine-2'-O-)-
methyltransferase 




Matched 
length 


CO 


CO 


CM 




CO 
CD 


CD 
h- 


h- 

LO 




CO 
CO 

T — 


CO 


CO 
LO 


CO 
CN 










CD 
CO 




Similarity 
(%) 


72.0 


66.0


64.9 




62.3 


65.2 


62.0 




64.7 


75.4 


58.5 


67.8 






79.3 




LO 




Identity 
(%) 


66.0 


61.0 


33.5 




30.7


36.1 


44.0 




39.4 


42.6 


28.3 


41.8 






CD 

CO* 




27.9 




Homologous gene 


Chlamydia muridarum Nigg 
TC0129 


Chlamydia pneumoniae 


Rattus norvegicus (Rat) 




Saccharomyces cerevisiae 
S288C YIL143C RAD25 


Mycobacterium tuberculosis 
H37Rv Rv0862c 


Mycobacterium tuberculosis 
H37Rv Rv0863 




Micrococcus luteus rpf 


Lactococcus lactis cspB 


Mycobacterium leprae 
MLCB57.27c


Deinococcus radiodurans 
DR0112 






Streptomyces coelicolor A3(2) 
SC6C5.09 




Streptomyces azureus tsnR 




db Match 


PIR:F81737 


GSP:Y35814 


pir:S66270 




sp:RA25_YEAST 


pir:F70815 


pir:G70815 




prf:2420502A 


prf:2320271A 


gp:MLCB57_11


gp:AE001874_1 






gp:SC6C5_9 




f— 

CO 

I 

-z. 

CO 
\— 

CL 

to 




ORF
(bp)


147


273


1209


639


1671


2199


219


843


597


381


525


774


699


138


1473


912


828


876


Terminal 
(nt) 


860078 


860473 


862752 


862753 


863396 


865119 


867571 


868630 


867803 


869318 


869379 


869918 


870721 


871660 


873210 


872016 


874040 


874069 


Initial 
(nt) 


860224 


860745 


861544 


863391 


865066 


867317 


867353 


867788


868399


868938 


869903 


870691 


871419 


871523 


871738 


872927 


873213 


874944 


SEQ 
NO. 
(a.a.) 


4409 


4410 


4411 


4412 


4413 


4414 


4415 


4416 


4417 


4418 


4419 


4420 


4421 


4422 


4423 


4424 


4425 


4426 


SEQ 
NO. 
(DNA) 


909


910


911


912


913


914


915


916


917


918


919


920


921


922


923


924


925


926






Function 


hypothetical protein 


phosphoserine transaminase 


acetyl-coenzyme A carboxylase 
carboxy transferase subunit beta 


hypothetical protein


sodium/proline symporter 




hypothetical protein 


fatty-acid synthase 






homoserine O-acetyltransferase 






glutaredoxin 


dihydrofolate reductase 


thymidylate synthase 


ammonium transporter 


ATP dependent DNA helicase 


formamidopyrimidine-DNA 
glycosidase 


Matched 
length 
(aa) 


CO 
CO 


co 


CO 
CO 
CM 


CO 

o 


CD 
LO 




CO 
CN 


3026 






LO 

CO 
CO 






CM 
CO 


r— 


CO 
CN 


CM 

o 

CM 


1715 


CO 
CD 
CN 


Similarity
(%) 


55.1 


52.9 


69.5 


80.6 


58.1 




77.4 


83.4






r-- 
co 

LO 






72.6 


62.0 


88.9 


56.4 


68.1 


51.0 


to 








































Identity 

(%) 


32.6 


21.9 


36.0 


51.5 


26.4 




49.0 


63.1 






29.0 






CD 
CO 


38.0 


64.8 


32.2 


47.4 


29.2 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0883c 


Bacillus circulans ATCC 21783 


Escherichia coli K12 accD 


Streptomyces coelicolor A3(2) 
SCI8.08C 


Pseudomonas fluorescens 




Mycobacterium tuberculosis 
H37Rv Rv2525c 


Corynebacterium 
ammoniagenes fas 






Leptospira meyeri metX 






Deinococcus radiodurans 
DR2085 


Mycobacterium avium folA 


Escherichia coli K12 thyA


Escherichia coli K12 cysQ 


Streptomyces coelicolor A3(2) 
SC7C7.16C 


Synechococcus elongatus 
naegeli mutM 


db Match 


sp:YZ11_MYCTU 


pir:S71439 


sp:ACCD_ECOLI 


gp:SCI8_8 


pir:JC2382 




pir:A70657 


pir:S55505 






prf:2317335B 






gp:AE002044_8 


prf:2408256A


sp:TYSY_ECOLI 


sp:CYSQ_ECOLI


gp:SC7C7_16 


sp:FPG_SYNEN 


ORF 
(bp) 


933


1128


1473


339


1653


816


840


8907


489


186


1047


426


267


237


456


798


756


4560


768

Terminal 
(nt) 


874951 


875985 


879642 


881985 


883647 


884541 


884549


894578 


895191 


895593 


895596 


896719 


897689 


897727 


897979 


898434 


899253 


904602 


905382 


Initial 
(nt) 


875883 


877112 


881114 


881647 


881995 


883726 


885388 


885672 


894703 


895408 


896642 


897144 


897423 


897963 


898434 


899231 


900008 


900043 


904615 


SEQ 
NO. 
(a.a.) 


4427 


4428 


4429 


4430 


4431 


4432 


4433 


4434 


4435 


4436 


4437 


4438 


4439 


4440 


4441 


4442 


4443 


4444 


4445 


SEQ 
NO. 
(DNA) 


927


928


929


930


931


932


933


934


935


936


937


938


939


940


941


942


943


944


945






Function 


hypothetical protein 


alkaline phosphatase 


integral membrane transporter 




glucose-6-phosphate isomerase


hypothetical protein 




hypothetical protein 


ATP-dependent helicase 


ABC transporter 


ABC transporter 




peptidase 


hypothetical protein 




5-phosphoribosylglycinamide 
formyltransferase 


5'-phosphoribosyl-5-aminoimidazole- 
4-carboxamide formyltransferase 


citrate lyase (subunit) 


Matched 
length 
(aa) 


CO 
CM 


CO 
O) 

T— 


co 
o 




to 

IO 


to 

CO 




CO 


CO 
CO 

1^ 


LO 
CO 
CO 


CN 




CO 
CO 
CN 


^- 

co 




CD 
CO 


to 

CN 
LO 


CM 












































































Similarity
(%) 


86.7 


CD 


67.0 




77.0 


52.3 




85.9 


co 


48.6 


71.4 




73.3 


60.8 




86.2 


87.8 


100.0 


Identity 
(%) 


55.5 


38.8 


33.8 




52.4 


24.6 




59.0 


46.1 


21.8 


43.8 




43.6 


CO 




64.6 


74.5 


o 

o 
o 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0870c 


Lactococcus lactis MG1363 apl 


Streptomyces coelicolor A3(2) 
SCI28.06c 




Escherichia coli JM101 pgi 


Mycobacterium tuberculosis 
H37Rv Rv0336 




Mycobacterium tuberculosis 
H37Rv Rv0948c 


Bacillus stearothermophilus 
NCA 1503 pcrA 


Streptomyces coelicolor A3(2) 
SCE25.30 


Bacillus subtilis 168 yvrO 




Mycobacterium tuberculosis 
H37Rv Rv0950c 


Mycobacterium tuberculosis 
H37Rv Rv0955 




Corynebacterium
ammoniagenes purN 


Corynebacterium 
ammoniagenes purH 


Corynebacterium glutamicum 
ATCC 13032 citE 


db Match 


pir:F70816 


sp:APL_LACLA


pir:T36776 




pir:NUEC


pir:G70506




sp:YT26_MYCTU


sp:PCRA_BACST 


gp:SCE25_30


prf:2420410P 




pir:D70716 


sp:YT19_MYCTU 




gp:AB003159_2 


gp:AB003159_3 


gp:CGL133719_3 


ORF
(bp)


408


600


1173


717


1620


1176


381


309


2289


2223


666


507


711


1425


228


627


1560


819

Terminal 
(nt) 


905796 


905792 


906559 


909328 


907759 


909521 


911223 


910855 


913514 


913477 


915699 


916368 


916970


919352 


917827 


919956 


921526 


922412 


Initial 
(nt) 


905389 


906391 


907731 


908612 


909378 


910696 


910843 


911163 


911226 


915699 


916364 


916874 


917680 


917928 


918054 


919330 


919967 


921594 


SEQ 

NO. 
(a.a.) 


4446 


4447 


4448 


4449 


4450 


4451 


4452 


4453 


4454 


4455 


4456 


4457 


4458 


4459 


4460 


4461 


4462 


4463 


SEQ 
NO. 
(DNA) 


946


947


948


949


950


951


952


953


954


955


956


957


958


959


960


961


962


963






Function 


repressor of the high-affinity (methyl) 
ammonium uptake system 


hypothetical protein 




30S ribosomal protein S18 


30S ribosomal protein S14 


50S ribosomal protein L33


50S ribosomal protein L28


transporter (sulfate transporter) 


Zn/Co transport repressor 


50S ribosomal protein L31


50S ribosomal protein L32




copper-inducible two-component 
regulator 


two-component system sensor 


proteinase DO precursor 


molybdopterin biosynthesis cnx1
protein (molybdenum cofactor
biosynthesis enzyme cnx1)




large-conductance 
mechanosensitive channel 


hypothetical protein 


5-formyltetrahydrofolate cyclo-ligase 


Matched 
length 
(aa) 


CN 
CN 
CN 


CD 

o 




CD 


o 
o 


CD 


I s - 


CD 
CN 
LO 


o 

CO 


CO 

I s - 


LO 
LO 




r- 

CN 
CN 


CO 


CO 

o 


CO 
CO 




CO 


o 

CN 


CD 


Similarity 
(%) 


100.0 


100.0 




76.1 


80.0 


83.7 


81.8 


71.1 


77.5 


65.4 


78.2 




73.6 


60.1 


59.9 


54.3 




I'll 


60.0 


59.7 


Identity 
(%) 


100.0 


100.0 




52.2 


54.0 


55.1 


52.0


34.4 


37.5 


37.2 


60.0 




48.0 


24.4 


33.3 


27.7 




50.4 


28.6 


25.1 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 amtR 


Corynebacterium glutamicum 
ATCC 13032 yjcC 




Cyanophora paradoxa rps18 


Escherichia coli K12 rpsN 


Escherichia coli K12 rpmG 


Escherichia coli K12 rpmB


Bacillus subtilis 168 yvdB


Staphylococcus aureus zntR 


Haemophilus ducreyi rpmE 


Streptomyces coelicolor A3(2) 
SCF51A.14




Pseudomonas syringae copR 


Escherichia coli K12 baeS 


Escherichia coli K12 htrA 


Arabidopsis thaliana CV cnx1




Mycobacterium tuberculosis 
H37Rv Rv0985c mscL 


Mycobacterium tuberculosis 
H37Rv Rv0990 


Homo sapiens MTHFS 


db Match 


gp:CGL133719_2 


gp:CGL133719_1 




sp:RR18_CYAPA 


sp:RS14_ECOLI 


sp:RL33_ECOLI 


pir:R5EC28 


pir:B70033 


prf:2420312A 


sp:RL31_HAEDU 


gp:SCF51A_14




sp:COPR_PSESM 


sp:BAES_ECOLI 


pir:S45229 


sp:CNX1_ARATH 




sp:MSCL_MYCTU 


pir:A70601 


pir:JC4389 


ORF
(bp)


666


327


321


249


303


162


234


1611


312


264


171


447


696


1365


1239


585


198


405


651


570

Terminal 
(nt) 


922396 


923138 


923981 


924159 


924425 


924734 


924901 


925325 


926931 


927737 


927922 


927339 


928812 


930248 


931648 


932290 


932487 


932570 


933060 


933733 


Initial 
(nt) 


923061 


923464 


923661 


924407 


924727 


924895 


925134 


926935 


927242 


927474 


927752 


927785 


928117 


928884 


930410 


931706 


932290 


932974 


933710 


934302 


SEQ 
NO. 
(a.a.) 


4464 


4465 


4466 


4467 


4468 


4469 


4470 


4471 


4472 


4473 


4474 


4475 


4476 


4477 


4478 


4479 


4480 


4481 


4482 


4483 


SEQ 
NO. 
(DNA) 


964


965


966


967


968


969


970


971


972


973


974


975


976


977


978


979


980


981


982


983



O 
CN 
CO 
CO 
CD 



CO 
CO 
CO 



CO 
CN 
CO 

o 

LO 
CO 



O 
CO 

in 
cn 



r-- 
co 



cn 
cn 



s 

CL 

To 

o 



o 

Cl 



o 

Cl 



^3" 
CO 



CO 
CO 

in 



o 

CO* 
CO 



CD 

CO 

CO 
CD 



o 

CO 
CD 



3 

o 
o 
o 
o 
o 



co 
a 
< 

CD 

t 



CN 

CD 
CN 

o 
u_ 

< 

CL 

cn 



CO 

in 



cn 

CN 



CO 

o 

CO 
LO 
CD 



CO 
CO 
CN 

LO 

CO 



CO 
CO 



CO 

r^- 

LO 
CO 
LO 
CO 

CO 
CO 



co 

CD 
CO 
lO 
CO 

O 

o 

LO 



CO 
CO 
CO 



o 
o 
o 
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{ 



Function 


transposase 


transposase subunit 




D-lactate dehydrogenase 


site-specific DNA-methyltransferase 




transposase 


transposase 


transcriptional regulator 


cadmium resistance protein 




hypothetical protein 


hypothetical protein 


dimethyladenosine transferase 


isopentenyl monophosphate kinase 




ABC transporter 


pyridoxine kinase 


hypothetical protein 


hypothetical protein 


Matched 
length 


CD 
CO 


CN 




LO 
CO 
LO 


CO 
CN 




O 


CD 
CO 


CD 


to 
o 

CN 




CO 
CO 
CN 


CN 
CD 
CO 


LO 
CD 
CN 


LO 

CO 




CO 

I s - 


CN 
CN 


CD 
LO 


CO 

o 


Similarity 
(%) 


67.6 


88.4 




75.6 


62.8 




59.6 


67.6 


84.6 


66.8 




70.7


63.5


65.3 


67.0 




85.8 


67.4 


58.5 


78.7 


Identity 

(%) 


I s - 


73.2 




46.4 


30.8 




33.0 




62.6 


31.7 




46.4 


34.8 


34.3 


42.5 




65.5 


40.1 


27.0 


45.4 


Homologous gene 


Escherichia coli K12 


Brevibacterium linens tnpA 




Escherichia coli dld


Klebsiella pneumoniae OK8 
kpnIM 




Enterococcus faecium 


Escherichia coli K12 


Mycobacterium tuberculosis 
H37Rv Rv1994c 


Staphylococcus aureus cadD 




Mycobacterium tuberculosis 
H37Rv Rv1008


Mycobacterium tuberculosis 
H37Rv Rv1009 rpf


Escherichia coli K12 ksgA 


Mycobacterium tuberculosis 
H37Rv Rv1011 




Saccharopolyspora erythraea 
ertX 


Escherichia coli K12 pdxK 


Mycobacterium tuberculosis 
H37Rv Rv2874 


Streptomyces coelicolor A3(2) 
SCF1.02 


db Match 


pir:TQECI3 


gp:AF052055_1 




prf:2014253AE 


sp:MTK1_KLEPN 




gp:AF029727_1 


pir:TQECI3 


sp:YJ94_MYCTU 


prf:2514367A 




pir:C70603 


pir:D70603 


sp:KSGA_ECOLI 


pir:F70603 




pir:S47441 


sp:PDXK_ECOLI 


sp:YX05_MYCTU 


gp:SCF1_2 


ORF




■^r 


CD 

oo 


1713 


o 

TT 
CO 


CD 
CN 


cn 

CN 


r-- 


h- 

LO 
CO 


CN 
CD 


CN 

co 


CO 
CO 


1071 


CD 
CO 


CO 
CO 
CD 


CN 
CD 


1833 


CN 
CD 

I s - 


o 

CO 
T 


CN 
CO 


Terminal 
(nt) 


954753 


955354 


956774 


955686 


957844 


959185 


960374 


960861 


961653 


962249 


961321 


963639 


964934 


965852 


966784 


965950 


968660 


969458 


969461 


970349 


Initial 
(nt) 


954277 


954941 


955911 


957398 


958683 


959403 


960081 


960385 


961297 


961629


961662 


962809 


963864 


964974 


965852 


966591 


966828 


968667 


969940 


970029 


SEQ 

NO. 
(a.a.) 


4501 


4502 


4503 


4504 


4505 


4506 


4507 


4508 


4509 


4510 


4511 


4512 


4513 


4514 


4515 


4516 


4517 


4518 


4519 


4520 




SEQ
NO.
(DNA)


1001


1002


1003


1004


1005


1006


1007


1008


1009


1010


1011


1012


1013


1014


1015


1016


1017


1018


1019


1020





=3 



Function 


hypothetical protein 


regulator 


hypothetical protein 


enoyl-CoA hydratase








major secreted protein PS1 protein 
precursor 


transcriptional regulator (tetR
family)


membrane transport protein 


S-adenosylmethionine:2-demethylmenaquinone
methyltransferase




hypothetical protein 


hypothetical protein 




peptide-chain-release factor 3 


amide-urea transport protein 


Matched 
length 
(aa) 


o 


T — 

CD 
CN 


CD 

r-- 

CN 


r*- 
co 

CO 








o 


o 
o 


CN 

o 

CO 


LO 




CN 


CN 
CO 




CD 
T 

LO 


o 


Similarity
(%) 


69.2 


88.1 


59.1 


70.9 








56.8 


70.0 


70.0 


75.8 




63.6 


48.3 




68.0 


72.8 


GO 




































Identity 
(%) 


LO 

LO 

CO 


64.8


27.2 


35.6 








27.7 


44.0 


42.6 


CN 
CO* 
CO 




CO 
CD 
CN 


24.9 




39.2 


42.8 


Homologous gene 


Streptomyces coelicolor A3(2) 
SCF1.02 


Streptomyces coelicolor A3(2) 
SCJ1.15 


Bacillus subtilis 168 yxeH 


Mycobacterium tuberculosis 
H37Rv echA9 








Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1


Streptomyces coelicolor A3(2) 
SCF56.06 


Streptomyces coelicolor A3(2) 
SCE87.17C 


Haemophilus influenzae Rd 
HI0508 menG 




Neisseria meningitidis NMA1953 


Mycobacterium tuberculosis 
H37Rv Rv1128c 




Escherichia coli K12 prfC 


Methylophilus methylotrophus 
fmdD 


db Match 


gp:SCF1_2 


gp:SCJ1_15 


sp:YXEH_BACSU 


pir:E70893 








sp:CSP1_CORGL 


gp:SCF56_6 


gp:SCE87_17 


sp:MENG_HAEIN




gp:NMA6Z2491_214


pir:A70539 




pir:I59305


prf:2406311A 


ORF


CN 
CO 


o 

CD 

cn 


CN 
CD 

r- 


1017 


LO 
CD 


ILL 


1212 


1386 


CD 
LO 


2373 


CO 
CO 

■^r 


CD 
CD 
CD 


t — 

CO 

CO 


1551 


CD 
CO 
CD 


1647 


1269 


Terminal 
(nt) 


970738 


971823 


972244 


974155 


973304 


974962 


974965 


977734 


977800


978368 


981490 


982287 


982294 


984650 


985845 


984864 


988007 


Initial 
(nt) 


970418 


970864 


973035 


973139 


973957 


974186 


976176 


976349 


978378 


980740 


980993 


981622 


982674 


983100 


984910 


986510 


986739 


SEQ 

NO. 
(a.a.) 


4521 


4522 


4523 


4524 


4525 


4526 


4527 


4528 


4529 


4530 


4531 


4532 


4533 


4534 


4535 


4536 


4537 


SEQ 
NO. 
(DNA) 


1021


1022


1023


1024


1025


1026


1027


1028


1029


1030


1031


1032


1033


1034


1035


1036


1037
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Function 


amide-urea transport protein 


amide-urea transport protein 


high-affinity branched-chain amino 
acid transport ATP-binding protein 


high-affinity branched-chain amino 
acid transport ATP-binding protein 


peptidyl-tRNA hydrolase 


2-nitropropane dioxygenase 


glyceraldehyde-3-phosphate 
dehydrogenase 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 


peptidyl-tRNA hydrolase 


50S ribosomal protein L25


lactoylglutathione lyase 


DNA alkylation repair enzyme 


ribose-phosphate 
pyrophosphokinase 


UDP-N-acetylglucosamine 
pyrophosphorylase 




sufI protein precursor


nodulation ATP-binding protein I 


Matched 
length 
(aa) 


r*- 


CO 
CN 


CO 
LO 
CN 


CO 
CO 
CN 


r-- 
co 


CO 
CO 


CN 

co 


LO 




cn 


CO 


CO 

o 

CN 


CD 

T 

CO 


CN 

LO 




909 


o 

CO 


Similarity 
(%) 


61.0 


68.0 


70.0 


69.1 


70.6 


54.0 


72.8 


61.0 


63.2 


65.0 


54.6 


62.5 


79.1 


71.9 




61.7 


CO 
CO 


Identity 
(%) 


40.8 


34.6 


37.9 


35.2 


39.0 


25.2 


39.5 


54.0 


38.5 


47.0 


28.7 


38.9 


44.0 


42.0 




30.8 


35.8 


Homologous gene 


Methylophilus methylotrophus 
fmdE 


Methylophilus methylotrophus 
fmdF 


Pseudomonas aeruginosa PAO 
braF 


Pseudomonas aeruginosa PAO 
braG 


Escherichia coli K12 pth 


Williopsis mrakii IFO 0895 


Streptomyces roseofulvus gap 


Neisseria meningitidis 


Escherichia coli K12 pth 


Mycobacterium tuberculosis 
H37Rv rplY 


Salmonella typhimurium D21 
gloA 


Bacillus cereus ATCC 10987 
alkD 


Bacillus subtilis prs 


Bacillus subtilis gcaD 




Escherichia coli K12 sufI


Rhizobium sp. N33 nodI


db Match 


prf:2406311B 


prf:2406311C 


sp:BRAF_PSEAE 


sp:BRAG_PSEAE 


sp:PTH_ECOLI 


sp:2NPD_WILMR 


sp:G3P_ZYMMO 


GSP:Y75094 


sp:PTH_ECOLI 


pir:B70622 


sp:LGUL_SALTY 


prf:2516401BW 


sp:KPRS_BACCL 


pir:S66080 




sp:SUFI_ECOLI


sp:NODI_RHIS3 


ORF 


CN 
CO 
CO 


1077 


CO 
CN 

t^- 


cn 
cn 

CO 


CN 
CO 


1023 


1065 


CO 
CD 
CO 


CO 
LO 


o 
o 

CD 


cn 

CN 


CN 
CD 


LO 
CO 


1455 


1227 


1533 


CO 

cn 


Terminal 
(nt) 


988904 


989980 


990705 


991414 


991417 


993080 


994613 


994106 


994845 


995527 


996830 


996833 


997466 


998455 


1000016 


1002864 


1003930 


Initial 
(nt) 


988023 


988904 


989980 


990716 


992028 


992058 


993549 


994474 


995375


996126 


996402 


997456 


998440 


999909


1001242 


1001332 


1003013 


SEQ 
NO. 
(a.a.) 


4538 


4539 


4540 


4541 


4542 


4543 


4544 


4545 


4546 


4547 


4548 


4549 


4550 


4551 


4552 


4553 


4554 


SEQ 
NO. 
(DNA) 


1038


1039


1040


1041


1042


1043


1044


1045


1046


1047


1048


1049


1050


1051


1052


1053


1054





13 











































c 












































"tu 




Function 


hypothetical membrane protein 


two-component system sensor 
histidine kinase 


two component transcriptional 
regulator (luxR family) 




hypothetical membrane protein 


ABC transporter 




ABC transporter 


gamma-glutamyltranspeptidase 
precursor 










transposase protein fragment 


transposase (IS1628 TnpB) 








transcriptional regulator (TetR- 
family) 


transcription/repair-coupling protein
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Homologous gene 


Streptomyces lividans ORF2 


Escherichia coli K12 uhpB 


Streptomyces peucetius dnrN 




Streptomyces coelicolor A3(2) 
SCF15.07 


Streptomyces glaucescens strV 




Mycobacterium smegmatis exiT 


Escherichia coli K12 ggt 










Corynebacterium glutamicum 
TnpNC 


Corynebacterium glutamicum 
22243 R-plasmid pAG1 tnpB 








Escherichia coli tetR 
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1011797 
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1015145 
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1022716 


1019390 
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(nt) 
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Function 




hypothetical protein 


transcription activator of L-rhamnose 
operon 


hypothetical protein 




hypothetical protein 


transcription elongation factor 


hypothetical protein 


lincomycin-production 




3-deoxy-D-arabino-heptulosonate-7- 
phosphate synthase 




hypothetical protein or undecaprenyl 
pyrophosphate synthetase 


hypothetical protein 






pantothenate kinase 


serine hydroxymethyl transferase 


p-aminobenzoic acid synthase 
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milarity 
(%) 




74.1 


55.8 


80.1 




57.1 


60.1 


72.1 


56.3 




99.5 




97.3 


100.0 






79.9 


100.0 


70.1 
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Identity 
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CO 
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57.8 




30.0 


35.0 


34.3 


31.7 




99.2 




96.0 


100.0 






53.9 


99.5 


47.6 




Homologous gene 




Thermotoga maritima MSB8 


Escherichia coli rhaR 


Mycobacterium tuberculosis 
H37Rv Rv1072 




Streptomyces coelicolor A3(2)
SCF55.39


Escherichia coli greA 


Mycobacterium tuberculosis 
H37Rv Rv1081c 


Streptomyces lincolnensis lmbE




Corynebacterium glutamicum 
aroG 




Corynebacterium glutamicum 
CCRC18310 


Corynebacterium glutamicum 
(Brevibacterium flavum) 






Escherichia coli coaA 


Brevibacterium flavum MJ-233 
glyA 


Streptomyces griseus pabS 




db Match 




pir:B72287 


sp:RHAR_ECOLI 


pir:F70893 




gp:SCF55_39 


sp:GREA_ECOLI 


pir:G70894 


pir:S44952 




sp:AROG_CORGL 




sp:YARF_CORGL 
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1043774 


1044477 


1046030 


1046390 


1047707 


1046820 


1048501 


1048529 


1049043 


1049068 


1049427 


1051925 


1053880 


1054602 


Initial 
(nt) 


1039996 


1040494 


1040925 


1042027 


1043236 


1043747 


1044295 


1044959 


1045158 


1046073 


1046610 


1047452 


1047827 


1048356 


1048525 


1049385 


1050362 


1050624 


1052021 


1053880 
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NO. 
(a.a.) 
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4597 


4598 


4599 


4600 
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lce protin 










Function






phosphinothricin resistance protein


hypothetical protein




hypothetical protein


lactam utilization protein


hypothetical membrane protein






transcriptional regulator




fumarate hydratase precursor


NADH-dependent FMN
oxidoreductase






reductase


dibenzothiophene desulfurization
enzyme A


dibenzothiophene desulfurization
enzyme C (DBT sulfur dioxygenase)


dibenzothiophene desulfurization
enzyme C (DBT sulfur dioxygenase)
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length 
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58.8 


59.0 
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63.2 
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Identity 
(%) 






30.3 


30.3 




37.8 


30.8 


40.6 






26.0 




52.0 


32.7 






55.4 


39.1 


25.8


28.9 






Homologous gene 






Alcaligenes faecalis ptcR 


Escherichia coli ybgK 




Escherichia coli ybgJ 


Emericella nidulans lamB 


Bacillus subtilis ycsH 






Bacillus subtilis ydhC 




Rattus norvegicus (Rat) fumH 


Rhodococcus erythropolis 
IGTS8 dszD 






Streptomyces coelicolor A3(2) 
StAH10.16 


Rhodococcus sp. IGTS8 soxA 


Rhodococcus sp. IGTS8 soxC 


Rhodococcus sp. IGTS8 soxC 






db Match 






gp:A01504_1 


sp:YBGK_ECOLI 




sp:YBGJ_ECOLI 
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sp:YCSH_BACSU 
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sp:FUMH_RAT 


gp:AF048979_1 






gp:SCAH10_16


sp:SOXA_RHOSO 


sp:SOXC_RHOSO


sp:SOXC_RHOSO 
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1059962 


1060792 


1062146 


1062211 


1064424 


1064478 


1064754 


1065304 


1067570 


1068649 


1069845 


1068913 


1069119 


Initial 
(nt) 


1054859 


1055032 


1055783 


1057200 


1057573 


1057868 


1058598 


1059214 


1059218 


1059360 


1060112 


1060869 


1063629 


1063936 


1064738 


1065200 


1065867 


1066083 


1067570 


1068649 


1069692 


1069808 
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Function 


FMNH2-dependent aliphatic 
sulfonate monooxygenase 


glycerol metabolism 


hypothetical protein 


hypothetical protein 




transmembrane efflux protein 


exodeoxyribonuclease small subunit 


exodeoxyribonuclease large subunit 


penicillin tolerance 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 




permease 




sodium-dependent proline 
transporter 


major secreted protein PS1 protein 
precursor 


GTP-binding protein 


virulence-associated protein 


ornithine carbamoyltransferase 


hypothetical protein 
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length 
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56.4 


66.1 




78.1 


67.7 


55.6 


78.8 


47.0 




63.9 




61.4 


60.0 


88.6 


80.0 


58.8 


69.9 


Identity 
(%) 
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44.3 
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31.3 




36.6 


40.3 


30.0 


50.2 


33.0 




26.3 




30.3 


29.9 


70.1 


57.3 


29.6 


39.2 


Homologous gene 


Escherichia coli K12 ssuD 


Escherichia coli K12 glpX 


Mycobacterium tuberculosis 
H37Rv Rv1100 


Bacillus subtilis ywmD 




Streptomyces coelicolor A3(2)
SCH24.37 


Escherichia coli K12 MG1655 
xseB 


Escherichia coli K12 MG1655 
xseA 


Escherichia coli K12 lytB 


Neisseria gonorrhoeae 




Escherichia coli K12 perM 




Rattus norvegicus (Rat) SLC6A7 
ntpR 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1


Bacillus subtilis yyaF 


Dichelobacter nodosus intA 


Pseudomonas aeruginosa argF 


Bacillus subtilis 168 ykkB 


db Match 


gp:ECO237695_3


sp:GLPX_ECOLI


pir:B70897 


pir:H70062 




gp:SCH24_37 


sp:EX7S_ECOLI


sp:EX7L_ECOLI 


sp:LYTB_ECOLI 


GSP:Y75421 




sp:PERM_ECOLI 




sp:NTPR_RAT 


sp:CSP1_CORGL 


sp:YYAF_BACSU 


sp:VAPI_BACNO 


sp:OTCA_PSEAE 


sp:YKKB_BACSU 




1176 


CO 
CD 
CD 


o 

LO 


1902 


LO 
CO 
CN 


LO 
CN 
CM 


CO 
CN 


1251 


LO 

co 


CO 
CM 


CO 
CM 
CO 


1320 


o 

CO 


1737 


1233 


1083 


h- 

CO 
CM 


CM 
CM 
CO 


o 

LO 


Terminal 
(nt) 


1071134 


1071479 


1073245 


1073340 


1075641 


1075329 


1075667 


1075933 


1078271 


1077306 


1078319 


1079221 


1080786 


1080972 


1082951 


1085462 


1086087 


1086917 


1087044 


Initial 
(nt) 


1069959 


1072441 


1072676 


1075241 


1075357 


1075553 


1075909 


1077183 


1077297 


1077734 


1079146 


1080540 


1080965 


1082708 


1084183 


1084380 


1085791 


1086096


1087544 


SEQ 
NO. 
(a.a.) 


4635 


4636 


4637 


4638 


4639 


4640 


4641 


4642 


4643 


4644 


4645 


4646 


4647 


4648 


4649 


4650 


4651 


4652 


4653 


SEQ 
NO. 
(DNA) 


1135


1136


1137


1138


1139


1140


1141


1142


1143


1144


1145


1146


1147


1148


1149


1150


1151


1152


1153






o 



0) 

o 



1 Si 

^ CD 

O O 

C 3 

CD CD 

CD O 



CO 

0) 
co 
CO 



O 
CL 

co 



cr 
o 



(0 

co t— 
O CO 
Q_ go 



CO 



(0 
CO 

o 

Q_ 
CO 

c 

CD 



CO 
CO 
O 
CL 
CO 

c 

(0 



CO 
I 

CD 

.E <D 

frS 

c TO 

S.E 

CD <U 

- CO 



CO 



CD 



oj £ « 

X <D g 



o 

o 
c 

8 o> 

3 co 

I* 
1 1 

CO g 

"a 



o 

CD >■» 

2 o 

QL XI 



3 o 
"o c 

CD 

CD £_ 
C 
CD 
cd c 

"o ^ 

-1 

CD > 



TO 

CO -C ^ 
CD CD - — * 



00 
CD 



CD 
CD 
CO 



I s - 
CD 



LO 



GO 



co 

CN 



co 
o 



CD 



CO 



o 

CD 



CO 



co 

CD 



CD 



lo 

CD 



CD 
CD 



CO 
CD 



CO 
CO 



CO 
CO 



CN 
CN 



CN 
CO 



CM 
CD 



co 



I s - 
co 



co 

CO 



CO 



CD 



05 
.CD 



CD 
CD 
CO 
=J 
O 
CD 

_o 

O 

E 
o 
X 



X 
Q 
Ol 



E 



o 
o 

"55 
o 
o 

CO 
CD 
O 

>>o 
E ^. 

O CO 

UO 

CD CO 

-S= O 
co co 



o 
E 

CO 
CD 



CD ® 
XI 

o h- 
O < 



ii 

ii 

<D CD 

O o3 co 
— -+ - co 

CO O CO 
XI CO ^ 
CD XJ 

c > O 

fD O 



11 

M 
ii 

CD -r— CD 
CD XJ 

c > O 

<3s^ 



o 



CO 

'■5 

CL 

to 

CO 

c: 
o 

E 
o 

"O 



CO 
c: 



3 

> 



CD 
O 

E 
o 



3-s 



X) 

■o 



CO 

CO 
CN 
CO 



CL 
CD 



o 
o 
a: 
f— 

CO 



CO 

> 

CL 
CO 



CD 
CO 

co 



CN 

I s - 
O 



O 
—> 

CL 



ZD 
Q_ 
LLi 
CO 
CL 

<' 

01 

o 



< 
o 

o 
< 

o 

O 

a 



CN 

o 
co 
co 
lo 
o 



o 

CO 
CD 



CO 

o 

CN 



CM 



CD 
CM 



lo 
co 



CO 
CO 



CM 
CO 



LO 
CD 



LO 
CD 



CD 

c 

I : 

CD 



CD 
CD 

r>- 
co 
o 



ID 
CO 
CO 

o 



I s - 

CO 

co 

LO 
CD 

o 



lo 

CD 
O 



CO 
CO 

CO 
CD 
O 



CM 
CD 

in 

CO 
CD 

o 



CD 
CN 
CD 
CO 
CD 
O 



O 

in 
I s - 

CD 
CD 
O 



O 
CD 
CD 
O 



CD 
CD 
O 



CO ; 



CO 
CD 
CN 
CO 
CO 
O 



o 

CD 

oo 
o 



r- 

CD 

in 

CD 

o 



CO 
CD 

o 

CO 
CD 

o 



CO 
CO 
CO 
CD 

o 



o 

LO 

I s - 

CD 

o 



CD 

o 

CD 
CO 
CD 
O 



CD 
O 
CN 
CD 
CD 
O 



CO 
CO 
I s - 
CD 
CD 
O 



2§ 2 

CO ^ ^ 



m 
co 



to 

LO 
CD 



O 
CO 
CO 



CD 
CD 



CN 
CD 
CD 
^3" 



CO 
CD 
CD 



CD 
CO 



CD 
CD 
CD 



CD 



o n < 

W2Q 



LO 



LO 
LO 



O 
CO 



CM 
CO 



CO 
CO 



I s - 
CD 



CD 
CD 



O 






o 



ID 



IB 



CO 

I* 

CO 



"D 



CD 



e 
o 
o 



,C0 



O) 



O 
CD 

O 

£ 
o 



o 
to 



T3 



E 

CD 



So - 

co 2: «. 



C ri < 



X 

o 



O 
X3 



CO 
CD 

to 



CO 



co 



CN 

co 

O 
O 

CL 



o 

a) o 
>-» o 

CO (D 



"3" 

co 

CN 

CD 

m 
ZD 
Gl 
co 

CL 
CO 



CO 



CO 

in 

CD 



CD 
CD 
O 



1^- 



c 

CD 

"o 

CL 



O 
CL 



in 
in 
co 



o 
co 



E 05 

8 > 

o DC 
o r-> 

CO 

^ X 



CL 
</> 



CD 
IO 
CO 



CO 

o 



o 



CO 

m 
o 

CD 
O 



r- 

CD 



3 
Cfl 

CD 
CO 
CO 

i5 
o> 
_c 
o 

E 



CD 
CO 

E 



CO 
CN 
CO 



CN 

in 



CN 



O 
O 



CD 
O 

<u 

CO 

-C 

CL 
CO 
%— 

CD 

T) sz 
co o 

-2 <^ 

^ CN 
£ ° 



X 
CO 

O 
X 
Ql 

_l 
X 

o 

CD 

CL 
CO 



CD 
CO 
CN 



CO 
CO 
O 
CO 
O 



CO 
CO 

o 



CO 
CO 



CO 



-2 
"3 

c E 

CD Q) 

•H S 

St 

CN CL 



O 
CO 



CN 
CD 



CO 
CO 



E 

CD 



O 

c 

CO 
"CD 

E 



CL 
O 

to 

o 
o 

E 
< 



CO 

o 

CO 
CO 



CL 
CO 



CN 
CD 



O 
CN 
CO 
O 



O 
CO 

in 



CO 

co 



co 



c 

CP 
CL 



o 

CL 



CN 
CD 
CN 



o 

CO 



CO 

co 



CD 
-Q 

3 

o 

8 > 

O > 

o a: 

o 

>>co 
^ X 



r-- 
m 
o 

< 



m 
o 



m 
o 

CO 

co 
o 



o 

CN 
CO 

o 



o 

CO 
CO 



o 

CO 



CL 



CD 

g CD 
o CO 

4 1 

g- o 



CO 
CN 



CD 

in 



CD 
CN 



CL 
O 

o 



co <r 
<d B_ 

CD 
CL CN 
CD t— 
±Z U_ 
CO CO 



< 

CL 
O 

m 



co 



CN 

co 



10 
r-- 

CD 
O 



CO 
CD 
CO 
CO 

o 



CO 
CO 



CO CD 



& CL 



CO 
CO 

in 



10 



co 



CJ 

CD 
CO 

CO 



E 
o 

CD 
CO 



a: 

LL. 

I— 

co 

1 

O 
ct: 



CD 



CN 
CO 



CN 
CD 

co 
o 



CN 
CO 
CD 



CD 
•4— < 

o 

CL 

"co 

O 



O 
CL 



CO 
CO 



CO 
CO 



CD 
CN 



CD 

^ CJ 

E 00 

»^ 
8 > 

_Q > 

o a: 

>» co 
S X 



O 



o 

co 
o 

>- 



CO 
CD 
CO 



in 

CN 



o 

CN 
CO 



CO 
CO 
CD 



co 
co 



CD 
O 



CD 
CO 

"El 

a> 

to 
c 
o 

xz 

CL 



O 

CN 
CO 



in 
in 



m 
in 
co 

5 



CN 

5 



CO 

o 

*k_ 

-C < 

o a 



O 
O 
LU 



X 

CL 



CN 
CO 



O 
CO 
CN 
CN 



CD 
CO 

CO 



CO 
CD 



GO 



CL 

X 
3 

CD 

CD 
O 

tz 

CO 



CO 

ZD 



3 

E 



CD 

CO 



CD 

in 



CN 
CN 



^ E 

CO CL 



r-- 
co 

CO 

CL 
CO 

CL 
CO 



CO 
CN 



CN 
O 



CO 
CO 
CO 



CD 
GO 



a> 
u 
c 

CD 
3 
XT 
CD 
CO 

cz 
o 

'tz: 

CD 
CO 
CZ 



S CO 
CL CO 



2 co 



CO 
CO 



CO 
CO 



CD 
CD 



I 

ii 

d 3 

<D d - 



CO o 



CD _Q 

cd a 



CO 
CO 

CO 



co 
o 

CO 



CO 
CD 

m 



CO 
CO 



CO 
CO 

"3- 



CO 






o 
c 



CD 
CD 



CD 

"O Q) 

V- CO 
O m 

o 

3 O 
C X: 

' cl 
o> to 

It 

o o 

C Q- 



i5 
o 



< 

Q 



O 
Q_ 

a> 
c 

CD 
k_ 
XI 

E 

E 

co 
o 

o 



C 

2 

CL 



O 
CL 



0) 
Q. 



O 



C 
CD 

"S 

CL 
CD 

CD _ 
3 > 

O en 
-C CD 
CL >, 
CO — 

Q-O 

TO TO 



<D t- 

°£ 

CL O 
^ Q_ 

O C 

9* <o 

CO w_ 

c 

CO CD 
*— "ti 
CO 

o o 

C N 

co *= 
»— (D 
-Q x» 

CD g 



CO 

_co 

$? 

o 

xi 
>» 
x: 

0J CO 
CD CD 

I s s 

1 a s 

2 £S 

"O o o 
Cl X: E 



a> 

~CO 
O 



O 
CL 



*o 

Q_ 
CD 
C 
CD 
k— 
Xl 

E 

CD 

E 

CO 

o 

'-*—> 
CD 
X: 
-♦— < 

o 



£= CO 
CD CD — ' 



CD 
CO 



CO 
00 
CM 



m 

CO 
CN 



CN 
CD 



00 
O 



CO 



o 

CN 



ID 
CD 
CO 



<o 

CO 
CN 



co 



CO 



co 

CD 



o 

CD 



in 



CD 
CD 



"3" 



o 

CD 



CO 



CO 
CD 



in 



CD 
T3 



CD 
CO 



co 



CO 
CN 



CO 
CO 



a> 

CN 



co 

CN 



CO 

o 



CO 
CN 



CD 

r--' 

CN 



03 



o 
o 



.a 

,C0 



CD 
CD 



O 

cd 
_o 

o 

E 
o 

X 



§ s> 

g * 

^ CO 

CD CO 

ro ^ 
5= 

co -9- 

3 =3 

o to 

O CD 

o -a 

U 

CD 

f- -*-• 



_o 

3 
O 

CD 
XI 
3 



CD 



O 



CD 



E P 

O CO 

CD LO 

i= O 

CO CO 



to 
c: 

CO 

L_ 

3 

-o 
o 

CO 



O CN 
O <<- 
O 

cd cr 

Q Q 



<D 
O 
U 
CO 
CD 
O 

>->CO 
E <=> 

cl< 

CD CO 

-fc= O 

CO CO 



CD 

-C LL 

CO _Q 

UJ >, 





CD 


IpIA 


phn 


CN 


CN 












~o 


o 


O 


CD 


CO 


JZ 


JZ 


o 


o 


CD 


CD 


SI 


-C 


o 


o 


CO 


CO 


LU 


LU 



CO 

o 

CL 
CO 

■g 

CL 
CO 
CO 

c 
o 

E 
o 

T3 
3 
CD 



JZ 
CL 



o 

c: 

O) 

3 

CD 
CO 



O 

E 
o 
■o 

3 
CD 



CD 



LU 



CO 
CO 
1— 

o 



CO 
CO 
3 

O T~ 

O CO 

O T- 

o m 
>*< 

CL Q_ 



XI 
3 
CO 



CO 



CN 
LO 
x — 
CO 

< 

LL 
CL 



ID 

a 



o 

Q 
< 



co 
CO 

LO 

o 

CO 
CL 
O) 



o 
o 

LU 

< 

CL 
CO 



< 

CO 

O 
CO 

CL 
CD 



o 
o 

LU 

u_' 
Q 
CO 
> 



O 

a 

LU 

i 

CQ 

X 
CL 



3 
CL 
LU 
CO 
CL 

< 

o 

Q. 



UJ 
< 
LU 
CO 
CL 

>' 

X 
X 
CL 
CL 



CO 

a 
< 

CO 



o 
o 
in 
r- 

a 



St 



o 



co 

CO 



o 
o 

CO 



o 
o 

CO 



CO 
CD 
CN 



in 

co 



co 
o 



co 

CN 



E 

CD 



CN 
CO 
CO 
LO 



CO 
O 
CD 
CO 



CD 
CO 
O 
CD 



O 
CO 
O 
CN 



CO 
CO 
CO 
O 
CN 



CO 
CD 



co 
co 

CN 



CO 

in 
co 

CN 



CD 
CO 
CO 



CO 

CN 



o 

CO 



in 
o 

CD 
CD 



CN 
CO 



LO 
O 
CN 
O 
CN 



CN 
CO 
^3" 



C0 

o 

CO 



LO 

o 

CO 
CN 



CO 
CN 
CO 



O 
CN 
O 
CD 
CN 



CN 

o 

CO 



CO 
CN 



CN 
CO 



So - 

GO ^ ^ 



CO 
CO 
CD 



CD 
CO 
CD 



co 



CN 
CD 
CO 



CO 
CD 
CD 



CD 
CD 



CO 
CD 
CD 



CD 
CD 



CO 
CD 
CD 



o 



LO 

o 



co ^ a 



CO 
CO 



CO 
CO 



CN 
CD 



CO 
CD 



CD 



CO 
CD 



CD 



CO 
CD 



o 

CN 



in 
o 

CN 






CO 
"O 
X 

o 

CD 
CL 



X 

o 

"O 

CO 



c 
E 

CO 

cd ^~ 

i° to 

O (ft 
3 CD 

o 

V 0 



CD 



3 



cu 

CO 
CO 

t> 

T3 
CD 

0) 

"CO 

c S 

CD "O 
CO O 

™ E 



E 

*CD 



O 
CL. 



c 

"03 

"o 



O 



CO 

o ^ 

>,< 

^ c 

C CD 
CD o 

o> £ 

.£ ro 

= f 



c 
"o 

Q_ 

75 
o 



o 

Q. 



c 
"o 

75 
o 



o 



CO 

1 



X 

o 

CD 



CD JZ „ . 

"S) cd 

CO CD — * 



CD 



CN 
CO 
CN 



CN 
CN 



^3- 
CO 



CD 
O 



to 

CO 



CO 

o 



EC 

CO 



co 



o 

CD* 



CN 



CN 
CO 



CD 



CD 



CO 
CD 



CN 
CO 



CO 



LO 
CO 



CO 
CD 



CN 



CO 
CO 



CD 
CO 



CD 
3 
C 



o 
o 



_CD 
.CO 



CD 
C 
CD 
O) 



O 
CD 

O 

E 
o 
X 



£ x 
co 

_Q > 

o o: 

o r-. 
>>co 



CO 

< 



CD 

o 



E 
o 



CO 



o 

CD 



LU 



CD 

Z3 

o 

P CN 

§ CO 

CD ^ 

£ > 
o cr 

o 

>> co 
^ X 



CD 
JQ 
Z3 

E ^ 

|j > 
o C£ 
co 

r> > 

o cr 

o h- 
>*co 



< 
a. 
>^ 

CM 

"5 
o 

co 
j= 
o 

CD 

o 
co 
LU 



CD 
jQ 
Z3 

-4— ' 

IS 

s> 
8 > 

o q: 

o 

>*co 
^ X 



CD 



-2 > 

o q: 

co 

o > 

o a: 

o r- 
>> co 
S X 



CO 
CD 
O 

E 
o 

CL. 
CD 

CO 



"co 



3 
1— 
O 



CL 



CD 

r»- 

LL 

O 
CO 

d 



O 

a 

LU 

a 
< 

CL 



CD 
LO 

in 
o 



m 
in 
in 
o 



O 
O 
LU 

<' 

Q_ 
> 
h- 
io_ 
co 



CO 

o 



m 

co 
o 

cn 



O 
cr 
f— 
co 

l 

a: 



in 

CD 



o 
o 

CD 



in 
co 



co 
m 



co 
o 
in 



o 
co 



E c 
i— — 

CD 



in 
in 
o 
in 

CO 



CD 

in 
co 
co 
co 



o 

CD 

o 



CD 
CN 



co 

CN 
O 
CO 



CO 
CN 
O 
CO 



CN 
O 
CO 



co 



r-- 
to 

CN 
CD 



CO 

"E 



CO 

in 

CO 



o 

CD 
CD 

CO 



in 

CN 



O 
CO 



a> 

CO 

co 



CD 
O 
CD 



CN 
CD 

in 



co 
in 

CD 

co 



So ? 
co ^ S 



o 



a> 

T — 



o 

CN 



CN 



CN 
CN 



CO 
CN 



m 

CN 



2 ii 

co 2 Q 



O 
CN 



CN 



CD 
CN 



O 
CN 
CN 



CN 
CN 



CN 
CN 
CN 



CO 
CN 
CN 



m 

CN 
CN 






c 
o 



o 



CD -C= ^ 

its 

CO CD ' 



o 

CO 

>*<o 
o a> 

-g'S 

2.9- 



o 
o 
"a. 

O 



CD 

^ o 



o 



CO 



'8 <° 

2 In -5 



CM 
CM 



C 
CD 

CL 



O 
CL 



CD 
CO 
CO 

x: 

"c 
>% 
to 



(0 

o 

s 



CO 
CN 



CD 

CL 

*C0 
O 



O 
CL 



CO 
CO 



> O 

1 E 

CD £= 

CO CO 

3 CD 



I— cD ° 

C > 3 
§ CD O 

.9? o_ cd 
c *- -° 



CD 
O) 
CD 
O 



O 

E 



E 



CO 
GO 
CM 



— 0) 

CO CO 

C CO 
Q) 

*S 

CO £ 

■ CO 



CO CO 

•a 8 

s ^ 

3 CD 

< CO 



CO 
CO 



CO 



CO 



CD > 



CO 

§» 

O) CO 



o 
o 



C0 
w_ 
^CD 
CO 



CD 

E 



CO 



T3 
C 
CO 

u 
o 



"2 > 

CO '.£3 

E <° 
*n g 



CO 

!* 

CO 



o 
o 



o 
o 



CO 
CO 



CO 



LO 
CO 



CO 



CM 
CO 



o 
o 



o 
o 



CO 

to 



CO 
CO 



CM 



CO 
CO 



CM 



00 
CO 
CM 



CM 



"O 
CD 



.CO 



o 

CO 

o 

E 
o 
X 



o 
"E 

CO 

COQ 

E ^ 

ZD 



CO 



CD CM 
CD 

CO 

o h- 
O < 



o 
E 

CO 
CO 

.2 o 

CD CM 

_g <*> 

CD 

CO 

o h- 
O < 



CD 
O 

E 
o 

£-co 



CO 



CD 
-O 
ZJ 

§ o 
> 

o C£ 
co 

o > 

o a: 

o r^- 

CO 
2 X 



CD 

t5 

CO 
JO 

o 



CO 

-g 

=j 
o 

CD 
CO 
'CI 
O) 
CO 

k — 

CL 
CO 
O 

c 
o 
E 

2 < 
.52 >, 
^ E 



CD 



-5 < 

to o> 
LU co 



CD 

o 



8-a 

tO CO 



o 

CO 

o 
>» 

E 

to 

CD 
O 

£ 



CD tZ 
±- TD 

CO ^ 



LU 
O 
CL 



LU 



CO 
CO 

CD 
O 



co 
o 
o 

CO 
CO 



CO 
CL 

a 

CO 

CL 
CD 



CO 
O 
CO 

o 
O 



CM 
CO 



a 
a 



< 
> 



O 
o 

LU 
<' 

CL 



O 

o 

LY 

1- 

o 
o 

CL 



h- 
C0 



O 

o 

LU 

I 

LU 

O 

CL 

a: 

CL 
CO 



CO 
CO 



CO 

to 



co 
co 



co 
o 

CO 



LO 
CO 



CO 
CO 



CM 
CM 



CO 
CO 
CO 



CO 
CO 

to 



E 

I — ' 
CD 



CO 

co 

CM 
LO 



CO 
CO 

to 
r-- 

LO 



CM 
LO 
CO 
LO 



CM 

LO 
CO 
LO 



CO 
CO 

r^- 

C0 

LO 



CO 
CM 

o 

CO 



CO 

r^- 
co 

CM 
CO 



co 

CO 



CO 
CO 

to 

CO 



to 
o 

to 



CO 

to 

CM 
CO 
LO 



CM 
O 
CO 
CO 
LO 



CO 
CO 

LO 



co 

CM 
CO 
LO 



LO 
CO 

to 

CO 
LO 



LO 

to 

CO 
CO 
LO 



LO 

o 
to 

CO 

to 



CM 

o 
h- 
co 
to 



CO 
LO 
CD 



LO 

to 



So <d 

co ^ SB, 



CO 
CM 



CO 



CM 
CO 



co 



LO 
CO 



to 
co 



co 

CO 



CO 
CO 



o 



CM 



O r< < 
co ^ Q 



CO 
CM 
CM 



CO 
CM 



CM 
CO 
CM 



CO 
CM 



LO 
CO 
CM 



CO 
CO 
CM 



CO 
CO 
CM 



CO 
CO 
CM 



O 
CM 






o 
o 



CD 



Function 


hypothetical protein 


ATPase 


hypothetical protein 


hypothetical protein 


hypothetical protein 






2-oxoglutarate dehydrogenase 


ABC transporter or multidrug 
resistance protein 2 (P-glycoprotein 
2) 


hypothetical protein 


shikimate dehydrogenase 


para-nitrobenzyl esterase 








tetracycline resistance protein 


metabolite export pump of 
tetracenomycin C resistance 




Matched 
length 
(aa) 


CM 


to 
CM 


LO 


CO 


o 






1257 


1288 


o 

CM 


to 
to 

CM 


o 
to 








cn 
o 






Similarity
(%)


73.2 


72.0 


83.8 


77.0 


87.1 






99.8 


60.4 


72.1 


61.2 


64.7 








61.4 


64.2 




CO 






































Identity 
(%) 


45.5 


43.6 


60.4 


49.8 


57.9 






99.4 


28.8 


31.7 


25.5 


35.7 








27.1 


32.4 




Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv1224 


Escherichia coli mrp 


Mycobacterium tuberculosis 
H37Rv Rv1231c 


Mycobacterium tuberculosis 
H37Rv Rv1232c 


Mycobacterium tuberculosis 
H37Rv Rv1234 






Corynebacterium glutamicum 
AJ12036 odhA 


Cricetulus griseus (Chinese 
hamster) MDR2 


Mycobacterium tuberculosis
H37Rv Rv1249c


Escherichia coli aroE 


Bacillus subtilis pnbA 








Escherichia coli transposon 
Tn1721 tetA 


Streptomyces glaucescens tcmA 




db Match 


pir:C70508 


sp:MRP_ECOLI 


pir:B70509 


pir:C70509 


pir:A70952






prf:2306367A 


sp:MDR2_CRIGR


pir:H70953 


sp:AROE_ECOLI 


sp:PNBA_BACSU 








_i 
o 
o 

LU 

o 

\— 

CL 
CO 


sp:TCMA_STRGA 




ORF 


CO 
CO 


1125 


CD 

h- 

LO 


1290 


CO 

to 


CD 
CO 
CO 


o 
to 


3771 


3741 




o 

CO 


1611 


to 

CD 


CD 

co 


to 

CM 

to 


1215 


1347 


to 

O 


Terminal 
(nt) 


1167577


1167587


1168747


1169321


1171187


1171871


1171869


1172501


1176308


1180121


1180872


1183603


1184257


1185155


1185218


1187039


1188389


1190526






































Initial 
(nt) 


1167110 


1168711 


1169325 


1170610 


1170672 


1171206 


1172462 


1176271 


1180048 


1180837 


1181675


1181993 


1183607 


1184280 


1185742 


1185825 


1187043 


1189822 


SEQ 
NO. 
(a.a.) 


4743 


4744 


4745 


4746 


4747 


4748 


4749 


4750 


4751 


4752 


4753 


4754 


4755 


4756 


4757 


4758 


4759 


4760 


Hi U z 


CO 
«T 
CM 


CM 


to 

CM 


CO 
CM 


r-- 

CM 


CO 

CM 


cn 

CM 


o 
to 

CM 


to 

CM 


CM 

to 

CM 


CO 

to 

CM 


to 

CM 


to 
to 

CM 


CD 

to 

CM 


h- 
to 

CM 


CO 

to 

CM 


CD 

to 

CM 


o 

CD 
CM 






Function 


5-methyltetrahydropteroyltriglutamate--homocysteine S-methyltransferase




thiophene biotransformation protein 












ABC transporter 


ABC transporter 


cytochrome bd-type menaquinol 
oxidase subunit II 


cytochrome bd-type menaquinol 
oxidase subunit I 


helicase 




mutator mutT protein ((7,8-dihydro-8-oxoguanine-triphosphatase)(8-oxo-dGTPase)(dGTP pyrophosphohydrolase))




proline-specific permease


Matched 
length 
(aa) 


















CO 
CN 
LO 


LO 
LO 


CO 
CO 
CO 


CN 
LO 


CN 

o 




CO 
CD 




CO 
CO 


Similarity 
(%) 


72.2 




79.5 












63.5 


"3" 

cri 

LO 


93.0


99.0 


55.0 




65.6 




85.0 


Identity 
(%) 


45.2 




55.2 












28.7 


29.4 


92.0 


99.6 


26.4 




CD 

CO 
CO 




51.3 


Homologous gene 


Catharanthus roseus metE 




Nocardia asteroides strain KGB1 












Escherichia coli K12 MG1655 
cydC 


Escherichia coli K12 MG1655 
cydD 


Corynebacterium glutamicum 
(Brevibacterium lactofermentum) 
cydB 


Corynebacterium glutamicum 
(Brevibacterium lactofermentum) 
cydA 


Escherichia coli K12 MG1655 
yejH 




Proteus vulgaris mutT 




Salmonella typhimurium proY 


db Match 


pir:S57636 




gsp:Y29930 












sp:CYDC_ECOLI 


sp:CYDD_ECOLI 


N i 

CO 
CO 

o 

LO 

CO 

o 

CD 

< 

CL 
CO 


co 1 

CO 

o 

LO 
CO 
O 
CO 

< 

CL 
CD 


sp:YEJH_ECOLI 




sp:MUTT_PROVU 




sp:PROY_SALTY 


ORF
(bp)


2235 


CO 
LO 


1398 


CN 
CO 


LO 
CD 


CN 
CD 

r- 


1647 


CN 
CD 


1554 


1533 


CD 
CD 
CD 


1539 


2265 


CN 
«T 
CO 


CO 
CD 
CO 


LO 
CO 

r^- 


1404 


Terminal 
(nt) 


1188388 


1191542 


1193807 


1194190 


1195109 


1195125 


1197620 


1197815 


1197990 


1199543 


1201090 


1202094 


1203916 


1206657 


1206831 


1208138 


1208212 


Initial 
(nt) 


1190622 


1191087 


1192410 


1193867 


1194165 


1195916 


1195974 


1197624 


1199543 


1201075 


1202088 


1203632 


1206180 


1206316 


1207223 


1207374 


1209615 


SEQ
NO.
(a.a.)


4761 


4762 


4763 


4764 


4765 


4766 


4767 


4768 


4769 


4770 

I 


4771 


4772 


4773 


4774 


4775 


4776 


4777 


SEQ 

NO. 
(DNA) 


1261 


1262 


1263 


1264 


1265 


1266 


1267 


1268 


1269 


1270 


1271 


1272 


1273 


1274 


1275 


1276 


1277 






Function 


DEAD box ATP-dependent RNA 
helicase 


bacterial regulatory protein, tetR 
family 


pentachlorophenol 4- 
monooxygenase 


maleylacetate reductase 


catechol 1,2-dioxygenase 




hypothetical protein 


transcriptional regulator 




hypothetical protein 


phosphoesterase 


hypothetical protein 






esterase or lipase 






Matched 
length 
(aa) 


CO 
CO 


r-- 

CM 


m 

CD 

to 


in 

CO 


CO 

h- 

CM 




in 

CO 

T — 


CO 

h- 
co 




CO 

o 

CM 


in 

CD 
CO 


in 

CD 






o 

CN 
CN 






Similarity
(%)


74.3 


47.4 


47.7 


72.0 


59.4 




58.4


55.4 




56.2 


67.3 


59.6 






64.6 






CO 




































Identity 

(%) 


48.1 


24.7 


24.5 


40.4 


CO 

o 

CO 




31.9 


CD 
CM 




CO 
CD 
CM 


39.2 


29.7 






37.3 






Homologous gene 


Klebsiella pneumoniae CG43 
DEAD box ATP-dependent RNA 
helicase deaD 


Mycobacterium leprae 
B1308_C2_181 


Sphingomonas flava pcpB 


Pseudomonas sp. B13 clcE


Acinetobacter calcoaceticus 
catA 




Mycobacterium tuberculosis 
H37Rv Rv2972c 


Saccharomyces cerevisiae 
SNF2 




Streptomyces coelicolor A3(2) 
orfZ 


Mycobacterium tuberculosis 
H37Rv Rv1277


Mycobacterium tuberculosis 
H37Rv Rv1278






Petroleum-degrading bacterium 
HD-1 hde 






db Match 


sp:DEAD_KLEPN 


prf:2323363BT 


CO 

CO 
LL 

m' 

CL 

o 

0. 

Q. 

to 


sp:CLCE_PSESB 


sp:CATA_ACICA 




pir:A70672


sp:SNF2_YEAST




gp:SCO007731_6 


pir:E70755


1— 
O 
> 

CO 

o 

>- 

cL 

CO 






gp:AB029896_1








2196 


CO 
CO 


1590 


1068 


in 

CO 
CO 


^ 


o 
m 


3102 


1065 


CO 

in 

CO 


1173 


2628 


CO 

o 

CO 


CO 

T — 

CO 


r^- 


CO 

co 


CD 
CO 

h- 


Terminal 
(nt) 


1212129 


1212429 


1214858 


1215938 


1216836 


1216904 


1217443 


1222996 


1221841 


1223843 


1225059 


1227693 


1227282 


1227340 


1228636 


1229095 


1229935 


Initial 
(nt) 


1209934 


1213115 


1213269 


1214871 


1215952 


1217374 


1217982 


1219895 


1222905 


1222986 


1223887 


1225066 


1227587 


1227657 


1227863 


1228718 


1229150 


SEQ 

NO. 
(a.a.) 


4778 


4779 


4780 


4781 


4782 


4783 


4784 


4785 


4786


4787 


4788 


4789 


4790 


4791 


4792 


4793 


4794 


SEQ 
NO. 
(DNA) 


CO 

h- 
CN 


CD 

r-- 

CN 


o 

CO 
CM 


CO 
CM 


CM 
CO 
CN 


CO 
CO 
CN 


co 

CN 


LO 
CO 
CN 


CO 
CO 
CM 


r-- 

CO 
CM 


CO 
CO 
CN 


CD 
CO 
CN 


o 

CD 
CM 


CD 
CM 


CM 
CD 
CM 


CO 
CD 
CM 


CD 
CM 



Function 


short-chain fatty acids transporter 


regulatory protein 






fumarate (and nitrate) reduction 
regulatory protein 


mercuric transport protein periplasmic
component precursor


zinc-transporting ATPase Zn(II)-translocating P-type ATPase


GTP pyrophosphokinase (ATP:GTP 
3-pyrophosphotransferase) (ppGpp 
synthetase I)


tripeptidyl aminopeptidase






homoserine dehydrogenase 






nitrate reductase gamma chain 


nitrate reductase delta chain 


nitrate reductase beta chain 


hypothetical protein 


hypothetical protein 


nitrate reductase alpha chain 


nitrate extrusion protein 


Matched 
length 
(aa) 


CM 
CN 


CO 

co 






CO 
CN 
CN 


CO 


in 
o 

CD 


CO 


o 

CD 






CN 






o 

CN 
CN 


in 
x — 


in 
o 
in 


CO 


CO 
CO 


1271 


CD 


Similarity 
(%) 


69.7 


56.6 






57.9 


66.7 


70.6 


58.4 


49.3 






98.0 






69.6 


63.4 


83.4 


48.0 


55.0 


73.8 


67.9 


Identity 
(%) 


37.7 


24.7 






25.0 


33.3 


38.0 


32.9 


26.6 






95.0 






45.0 


30.3 


56.6 


36.0 


36.0 


46.9 


32.8 


Homologous gene 


Streptomyces coelicolor 
SC1C2.14catoE 


Erwinia chrysanthemi recS 






Escherichia coli K12 MG1655 fnr 


Shewanella putrefaciens merP 


Escherichia coli K12 MG1655 
atzN 


Vibrio sp. S14 relA 


Streptomyces lividans tap






Corynebacterium glutamicum 






Bacillus subtilis narI


Bacillus subtilis narJ 


Bacillus subtilis narH 


Aeropyrum pernix K1 APE1291 


Aeropyrum pernix K1 APE1289 


Bacillus subtilis narG 


Escherichia coli K12 narK 


db Match 


sp:ATOE_ECOLI 


X 

o 

5 

OL 
LU 

I 

CO 

o 

LU 
Q_ 
d. 

CO 






sp:FNR_ECOLI 


sp:MERP_SHEPU


sp:ATZN_ECOLI 


sp:RELA_VIBSS 


gsp:R80504 






GSP:P61449 






sp:NARI_BACSU 


sp:NARJ_BACSU


sp:NARH_BACSU 


PIR:D72603 


PIR:B72603 


sp:NARG_BACSU 


sp:NARK_ECOLI 


ORF
(bp)


r^- 
co 
in 


CD 
CO 
T 


CN 
CN 
CN 


CD 

m 


o 
m 
r- 


co 

CN 


1875 


o 

CO 
CD 


1581 


CO 

o 

CD 


o 

CN 


CO 

o 


1260 


o 

CD 
CO 


h- 


CN 
CO 

r*- 


1593 


CD 

in 


CO 
CN 


3744 


1350 


Terminal 
(nt) 


1229180 


1230480 


1230831 


1230914 


1232479 


1232836 


1234881 


1235612 


1236545 


1241554 


1242156 


1243728 


1243942 


1244843 


1245720 


1246508 


1247199 


1250444 


1251817 


1248794 


1252557 


Initial 
(nt) 


1229716 


1229995 


1230610 


1231432 


1231730 


1232603 


1233007 


1234983 


1238125 


1242156 


1242275 


1243621 


1245201 


1245532 


1246496 


1247239 


1248791 


1249851 


1251545 


1252537 


1253906 


SEQ 
NO. 
(a.a.) 


4795 


4796 


4797 


4798 


4799 


4800 


4801 


4802 


4803 


4804 


4805 


4806 


4807 


4808 


4809 


4810 


4811 


4812 


4813 


4814 


4815 


LU § Z 


in 

CD 
CN 


CO 
CO 
CN 


i^- 
co 

CN 


CO 
CD 
CN 


CD 
CD 
CN 


o 
o 

CO 


o 

CO 


CN 
O 
CO 


CO 

o 

CO 


o 

CO 


in 
o 

CO 


CO 

o 

CO 


h- 
o 

CO 


CO 

o 

CO 


CD 
O 
CO 


o 

\ — 
CO 


CO 


CN 
CO 


CO 
CO 


CO 


in 

CO 






Function 


molybdopterin biosynthesis cnx1
protein (molybdenum cofactor
biosynthesis enzyme cnx1)


extracellular serine protease
precursor




hypothetical membrane protein 


hypothetical membrane protein 


molybdopterin guanine dinucleotide 
synthase 


molybdopterin biosynthesis protein


molybdopterin biosynthesis protein
(molybdenum cofactor
biosynthesis enzyme)


medium-chain fatty acid-CoA ligase


Rho factor 








peptide chain release factor 1 


protoporphyrinogen oxidase 




hypothetical protein 


undecaprenyl-phosphate alpha-N- 
acetylglucosaminyltransferase 


Matched 
length 
(aa) 


lo 


CO 
CO 

h- 




xr 
co 

CO 


CN 

xr 


CO 


CD 
CD 
CO 


xT 
LO 

CO 


CN 
LO 


CO 
LO 
I s - 








CO 
CD 
CO 


o 

CO 
CN 




LO 
CN 


CN 
CN 
CO 


Similarity 
(%) 


o 
10 

CO 


45.9 




62.6


60.2 


52.3 


58.2 


73.7 


65.7 


73.8 








71.9 


57.9 




86.0 


58.4 


Identity 
(%) 


32.5 


21.1 




30.8 


CO 

T 

CO 


27.5 


CO 

CN 
CO 


51.4 


36.7 


50.7 








CO 

? 


31.1 




62.3 


31.1 


Homologous gene 


Arabidopsis thaliana CV cnx1


Serratia marcescens strain IFO- 
3046 prtS 




Mycobacterium tuberculosis 
H37Rv Rv1841c 


Mycobacterium tuberculosis 
H37Rv Rv1842c 


Pseudomonas putida mobA 


Mycobacterium tuberculosis 
H37Rv Rv0438c moeA 


Arabidopsis thaliana cnx2 


Pseudomonas oleovorans 


Micrococcus luteus rho 








Escherichia coli K12 RF-1


Escherichia coli K12 




Mycobacterium tuberculosis 
H37Rv Rv1301


Escherichia coli K12 rfe 


db Match 


sp:CNX1_ARATH 


sp:PRTS_SERMA 




sp:Y0D3_MYCTU 


sp:Y0D2_MYCTU 


gp:PPU242952_2 


sp:MOEA_ECOLI 


sp:CNX2_ARATH 


sp:ALKK_PSEOL 


sp:RHO_MICLU 








sp:RF1_ECOLI 


sp:HEMK_ECOLI 




sp:YD01_MYCTU 


sp:RFE_ECOLI 




CO 
CO 

xr 


1866 


CO 
CO 


1008 


1401 


CD 
LO 


1209 


1131 


1725 


2286 


CO 

o 

CO 


CD 
CD 
CD 


1023 


1074 


h- 
CO 
CO 


xr 
r- 
r-- 


CO 

xr 
co 


1146 


Terminal 
(nt) 


1254634 


1254737 


1257750 


1256851 


1257865 


1259429 


1259993 


1261688 


1262886 


1267427 


1266267


1265611 


1265427 


1268503 


1269343 


1268267 


1270043 


1271192 


Initial 
(nt) 


1254146 


1256602 


1257067 


1257858 


1259265 


1259989 


1261201 


1262818 


1264610 


1265142 


1265665 


1266306 


1266449 


1267430 


1268507 


1269040 


1269396 


1270047 


SEQ 
NO. 
(a.a.) 


4816 


4817 


4818 


4819 


4820 


4821 


4822 


4823 


4824 


4825 


4826 


4827 


4828 


4829 


4830 


4831 


4832 


4833 


SEQ 

NO. 

(DNA) 


CD 
CO 


CO 


CO 
CO 


CD 
CO 


o 

CN 
CO 


r — 
CM 
CO 


CM 
CN 
CO 


CO 
CN 
CO 


xr 

CN 
CO 


LO 
CM 
CO 


CD 
CN 
CO 


h- 
CN 
CO 


CO 
CM 
CO 


Oi 
CN 
CO 


o 

CO 
CO 


CO 
CO 


CN 
CO 
CO 


CO 
CO 
CO 
Function




hypothetical protein 


ATP synthase chain a (protein 6) 


H+-transporting ATP synthase lipid-binding protein. ATP synthase C chain


H+-transporting ATP synthase chain 
b 


H+-transporting ATP synthase delta 
chain 


H+-transporting ATP synthase alpha 
chain 


H+-transporting ATP synthase 
gamma chain 


H+-transporting ATP synthase beta 
chain 


H+-transporting ATP synthase 
epsilon chain 


hypothetical protein 


hypothetical protein 


putative ATP/GTP-binding protein 


hypothetical protein 


hypothetical protein 


thioredoxin 


Matched 
length 
(aa) 




o 

CO 


LO 
CN 




LO 


h- 

CN 


CO 
LO 


o 

CN 
CO 


CO 
CO 


CM 
CN 


CM 
CO 


o 

CO 
CM 


LO 
CD 


co 


o 


o 

CO 


Similarity
(%)






































99.0 


56.7 


85.9 


66.9 


67.2 


88.4 


76.6 


100.0 


73.0 


67.4 


85.7 


56.0 


68.7 


79.2 


71.4 


CO 


































Identity 
(%) 




98.0 


24.1 


54.9 


27.8 


34.3 


66.9 


46.3 


99.8 


41.0 


38.6 


70.0 


45.0 


35.8 


54.5 


37.9 


Homologous gene 




Corynebacterium glutamicum 
atpI


Escherichia coli K12 atpB 


Streptomyces lividans atpL 


Streptomyces lividans atpF 


Streptomyces lividans atpD 


Streptomyces lividans atpA 


Streptomyces lividans atpG 


Corynebacterium glutamicum 
AS019 atpB


Streptomyces lividans atpE 


Mycobacterium tuberculosis 
H37Rv Rv1312 


Mycobacterium tuberculosis 
H37Rv Rv1321


Streptomyces coelicolor A3(2) 


Bacillus subtilis yqjC 


Mycobacterium tuberculosis 
H37Rv Rv1898 


Mycobacterium tuberculosis 
H37Rv Rv1324


db Match 




GPU:AB046112_1


sp:ATP6_ECOLI


sp:ATPL_STRLI


sp:ATPF_STRLI


sp:ATPD_STRLI


sp:ATPA_STRLI


sp:ATPG_STRLI 


sp:ATPB_CORGL 


sp:ATPE_STRLI 


3 
1— 
O 
>- 

5' 

o 
> 
cL 

CO 


3 
1— 
O 
>• 

CD' 
CO 

o 

CL 

CO 


GP:SC26G5_35 


3 
CO 

o 
< 

CO 

I 

o 

a 
> 

CL 
CO 


3 
1— 
O 
> 

o f 

CM 
O 
> 
CL 

CO - 


3 
1— 
O 
>- 

CM 

Q 
>- 

CL 

CO 


ORF
(bp)


CD 
CO 


CO 
CN 


o 
— 

CO 


o 

CN 


CD 
LO 


CO 
CO 


1674 


LO 

cr> 


1449 


CN 

co 




o 

CO 
CO 


LO 
CO 
CM 


CO 
LO 
XT 


CM 
CO 


CM 
CD 


Terminal 
(nt) 


1271698 


1272119 


1273149 


1273525 


1274122 


1274943 


1276648 


1277682 


1279136 


1279522 


1280240 


1280959 


1281251 


1281262 


1282105 


1283114 


Initial 
(nt) 


1271213 


1271871 


1272340 


1273286 


1273559 


1274131 


1274975 


1276708 


1277688 


1279151 


1279770 


1280270 


1280967 


1281714 


1281794 


1282194 


SEQ 

NO. 
(a.a.) 


4834 


4835 


4836 


4837 


4838 


4839 


4840 


4841 


4842


4843 


4844 


4845 


4846 


4847 


4848 


4849 


SEQ 
NO. 
(DNA) 


CO 
CO 


LO 
CO 
CO 


CD 
CO 
CO 


CO 
CO 


CO 
CO 
CO 


CD 
CO 
CO 


o 
co 


co 


CN 
CO 


CO 
CO 


co 


LO 

co 


CO 

co 


h- 
co 


CO 
CO 


CD 

co 



Function 


FMNH2-dependent aliphatic 
sulfonate monooxygenase 


aliphatic sulfonates transport
permease protein 


aliphatic sulfonates transport
permease protein 


sulfonate binding protein precursor 


1,4-alpha-glucan branching enzyme 
(glycogen branching enzyme) 


alpha-amylase 




ferric enterobactin transport ATP- 
binding protein or ABC transport 
ATP-binding protein 


hypothetical protein 


hypothetical protein 




electron transfer flavoprotein beta- 
subunit 


electron transfer flavoprotein alpha 
subunit for various dehydrogenases 




nitrogenase cofactor synthesis protein




hypothetical protein 


Matched
length
(a.a)


CD 
CD 
CO 


o 

CN 


CO 
CN 
CN 


CO 


o 


r-- 
co 






T— 
T — 

CN 


o 

CD 
CN 


CO 
CO 




CN 


LO 
CO 
CO 




LO 
CO 




r-- 

CD 
CO 








































Similarity 
(%) 


CO 


CO 


CO 






LO 






CO 


LO 


o 




CO 


CO 








r*- 


T 

h- 


LO 

r^. 


CN 


CN 
CO 


CN 


o 

LO 






CO 


CO 
CD 


o 
r- 




CD 


CD 




CD 




LO 
LO 


Identity 
(%) 


CO 


CO 








CD 






CO 


CD 






CN 






CN 




LO 


o 

LO 


o 


o' 

LO 


LO 
CO 


CD 


CN 
CN 






CO 


CD 
CO 


co 




T — 

CO 


CO 
CO 




LO 

CO 




oS 

CN 






































mid 


Homologous gene 


Escherichia coli K12 ssuD 


Escherichia coli K12 ssuC 


Escherichia coli K12 ssuB 


Escherichia coli K12 ssuA 


Mycobacterium tuberculosis 
H37Rv Rv1326c glgB


Dictyoglomus thermophilum 
amyC 






Escherichia coli K12 fepC 


Mycobacterium tuberculosis 
H37Rv Rv3040c 


Mycobacterium tuberculosis 
H37Rv Rv3037c 




Rhizobium meliloti fixA 


Rhizobium meliloti fixB 




Azotobacter vinelandii nifS




Rhizobium sp. NGR234 plasmid
pNGR234a y4mE


db Match 


CO 

w' 

CD 
CO 

co 

CM 

O 

o 

LU 


SSUC_ECOLI 


SSUB_ECOLI 


SSUA_ECOLI


—i 
o 
o 

LU 

m ! 
O 
_i 

o 


X 

t- 
o 

Q 

1 

CO 

>- 
< 






FEPC_ECOLI 


:C70860 


:H70859 




:FIXA_RHIME 


:FIXB_RHIME 




:NIFS_AZOVI




:Y4ME_RHISN 




CL 
O) 


sp: 


:ds 


:ds 


sp: 


sp: 






sp: 


i 

Q_ 






sp: 


:ds 




sp; 








1143 


CO 
CO 

r-- 


CD 
CN 

r-- 


LO 
CO 


2193 


1494 


CO 
CO 


CD 
CO 


o 

CO 


1056 


CN 
CD 


CD 
CO 

r^- 


LO 
CD 


LO 
CD 


1128 


CN 
CO 


1146 


Terminal 
(nt) 


1284466 


1285284 


1286030 


1286999 


1287281


1289514 


1291373 


1292577 


1294025 


1295206 


1294436 


1296220 


1297203 


1297093 


1298339 


1298342 


1299000 


Initial 
(nt) 


1283324 


1284517 


1285302 


1286043 


1289473 


1291007 


1291026 


1291699 


1293222 


1294151 


1295047 


1295435 


1296253 


1296479 


1297212 


1298653 


1300145 


SEQ 
NO. 
(a.a.) 


4850 


4851 


4852 


4853 


4854 


4855 


4856 


4857 


4858 


4859 


4860 


4861 


4862 


4863 


4864 


4865 


4866 


SEQ 
NO. 
(DNA) 


o 

LO 
CO 


LO 
CO 


CN 
LO 
CO 


CO 

in 

CO 


LO 

CO 


LO 
LO 
CO 


CO 
LO 
CO 


LO 

CO 


CO 
LO 
CO 


CO 
LO 
CO 


o 

CO 
CO 


CD 
CO 


CN 
CD 
CO 


CO 
CD 
CO 


CO 
CO 


LO 
CD 
CO 


CD 
CO 
CO 






Function 


transcriptional regulator 


acetyltransferase 








tRNA (5-methylaminomethyl-2- 
thiouridylate)-methyltransferase 




hypothetical protein 


tetracenomycin C resistance and 
export protein




DNA ligase
(polydeoxyribonucleotide synthase
[NAD+])


hypothetical protein 


glutamyl-tRNA(Gln) 
amidotransferase subunit C 


glutamyl-tRNA(Gln) 
amidotransferase subunit A 


vibriobactin utilization protein / iron- 
chelator utilization protein 


hypothetical membrane protein 


pyrophosphate-fructose 6-phosphate 1-phosphotransferase


Matched 
length 
(a.a) 


CO 
LO 


CO 








CD 
CO 


• 


CN 
CO 
CO 


o 
o 

LO 




CD 


o 

CN 
CN 


CO 


co 


CO 
CO 
CN 


CD 
CO 


CO 
LO 
CO 


Similarity 
(%) 


76.3 


55.3 








80.9 




o 

CD 
CO 


65.8 




70.6 


70.9 


64.0 


83.0 


54.0 


79.2 


77.9 


Identity 
(%) 


47.5 


34.8 








61.8 




33.7 


30.2 




42.8 


40.0


53.0 


74.0 


28.1 


46.9 


54.8 


Homologous gene 


Rhizobium sp. NGR234 plasmid 
pNGR234a Y4mF 


Escherichia coli K12 MG1655 
yhbS 








Mycobacterium tuberculosis 
H37Rv Rv3024c 




Mycobacterium tuberculosis 
H37Rv Rv3015c


Streptomyces glaucescens tcmA 




Rhodothermus marinus dnlJ


Mycobacterium tuberculosis 
H37Rv Rv3013 


Streptomyces coelicolor A3(2) 
gatC


Mycobacterium tuberculosis
H37Rv gatA 


Vibrio vulnificus viuB 


Streptomyces coelicolor A3(2) 
SCE6.24 


Amycolatopsis methanolica pfp 


db Match 


sp:Y4MF_RHISN


sp:YHBS_ECOLI 








pir:C70858 




pir:B70857 


sp:TCMA_STRGA 




sp:DNLJ_RHOMR 


pir:H70856 


o 
a 

h- 
o' 

< 

CD 

cL 

CO 


sp:GATA_MYCTU 


sp:VIUB_VIBVU 


gp:SCE6_24 


LU 
> 

CL 
LL 
Ol 
Cl 

CO 


ORF
(bp)


LO 
CN 
CN 


o 

LO 


CN 
CO 


1149 


CD 
CO 
CO 


1095 


LO 
CD 


o 

CO 
CO 


1461 


LO 
CO 


2040 


CO 
CD 
CO 


CO 
CN 


1491 


CO 

co 


CO 

o 

CO 


1071 


Terminal 
(nt) 


1300145 


1301055 


1300988 


1301975 


1303694 


1304923 


1303883 


1305921 


1305924 


1307462 


1310369 


1310435 


1311616 


1313115 


1314118 


1314470 


1316083 


Initial 
(nt) 


1300369 


1300552 


1301929 


1303123 


1303299 


1303829 


1304536 


1304932 


1307384 


1308196 


1308330 


1311097 


1311320 


1311625


1313270 


1314775 


1315013 


SEQ 

NO. 
(a.a.) 


4867 


4868 


4869 


4870 


4871 


4872 


4873 


4874 


4875 


4876 


4877 


4878 


4879 


4880 


4881 


4882 


4883 


SEQ 
NO. 
(DNA) 


co 

CO 


CO 
CO 
CO 


CO 
CO 
CO 


o 

CO 


r-» 
co 


CN 

r- 
co 


CO 

r-- 
co 


CO 


to 
r-- 
co 


CO 

r-- 
co 


CO 


CO 

r- 

CO 


CO 
CO 


o 

CO 
CO 


5 

CO 


CN 
CO 
CO 


CO 
CO 
CO 
Function




glucose-resistance amylase 
regulator (catabolite control protein) 


ribose transport ATP-binding protein


high affinity ribose transport protein


periplasmic ribose-binding protein


high affinity ribose transport protein


hypothetical protein


iron-siderophore binding lipoprotein


Na-dependent bile acid transporter


RNA-dependent amidotransferase B


putative F420-dependent NADH
reductase


hypothetical protein


hypothetical protein


hypothetical membrane protein



dihydroxy-acid dehydratase


hypothetical protein


Matched
length
(aa)




CO 
CN 
CO 


CO 
CO 


CO 
CN 
CO 


LO 
O 
CO 


CO 
CO 


o 
o 

CN 


LO 

CO 


CO 
CO 
CN 


LO 
CO 


CN 

r-- 


CO 


CO 
CN 


LO 
CN 
CO 




CO 
CO 


LO 

o 










































































Similarity
(%) 






CN 


CO 






O 


CN 


CO 


CO 




CO 




CD 






CD 




CO 


CO 

r^- 


CO 

r^. 




cb 
CO 


CO 
LO 


O 
CO 


CO 




x — 

CO 


CD 
CD 


CN 
CO 


CN 
LO 




CO 

CO 


GO* 
CD 










CO 


CO 




O 


-^r 


CO 




CO 


CO 


CO 






CN 


CO 


lg 




CO 




LO* 


to 


? 


r — 
CO 


co 


LO 
CO 


CO 


CN 
CO 


CO 
CO 


CO 
CO 


CN 




CO 
CO 


CO 
CO 


Homologous gene 




Bacillus megaterium ccpA 


Escherichia coli K12 rbsA 


Escherichia coli K12 MG1655 
rbsC 


Escherichia coli K12 MG1655 
rbsB 


Escherichia coli K12 MG1655 
rbsD 


Saccharomyces cerevisiae 
YIR042c 


Streptomyces coelicolor 
SCF34.13c 


Rattus norvegicus (Rat) NTCI 


Staphylococcus aureus WHU 29 
ratB 


Methanococcus jannaschii 
MJ1501 f4re 


Escherichia coli K12 yqjG 


Mycobacterium tuberculosis 
H37Rv Rv2972c 


Mycobacterium tuberculosis 
H37Rv Rv3005c 




Corynebacterium glutamicum 
ATCC 13032 ilvD 


Mycobacterium tuberculosis 
H37Rv Rv3004 


db Match 




LU 

< 

<' 

Q_ 

a 
o 


RBSA_ECOLI 


RBSC_ECOLI 


RBSB_ECOLI 


RBSD_ECOLI


YIW2_YEAST 


SCF34_13


a: 

i 

o 
i— 
z 


d:W61467 


F4RE_METJA 


YQJG_ECOLI 


:A70672 


:H70855 




:AJ012293_1 


:G70855 






sp. 


:ds 


sp: 


sp: 


:ds 


:ds 


Q_ 
CO 


:ds 


to 

CO 


sp. 


sp: 


L_ 
Q_ 


CL 




D_ 
CO 


L_ 

CL 


ORF
(bp)


o 

CO 
CD 


1107 


1572 


CN 
h- 
co 


CN 
CO 


CO 
CO 
CO 


CD 
CO 
CD 


1014 


1005 


1479 


CN 
h- 
CD 


1077 


r^- 


1056 


CO 
CN 


1839 


co 

LO 


Terminal 
(nt) 


1315325 


1317444 


1319005 


1319976 


1320942 


1321320 

i 


1322111 


1323406 


1324537 


1326256 


1327049 


1329891 


1331875 


1333008 


1333188 


1333442 


1335412 


Initial 
(nt) 


1315954 


1316338 


1317434 


1319005 


1320001 


1320952 


1321476 


1322393 


1323533 


1324778 


1326378 


1330967 


1331102 


1331953 


1333424 


1335280 


1335975 


SEQ 

NO. 
(a.a.) 


4884 


4885 


4886 


4887 


4888 


4889 


4890 


4891 


4892 


4893 


4894 


4895 


4896 


4897 


4898 


4899 


4900 


LU § ^ 


CO 
CO 


LO 
CO 
CO 


CO 
CO 
CO 


CO 
CO 


CO 
CO 
CO 


CO 
CO 
CO 


o 

CO 
CO 


r — 

CO 
CO 


CN 
CO 
CO 


CO 
CO 
CO 


CO 
CO 


to 

CO 
CO 


CO 
CO 
CO 


co 

CO 


CO 
CO 
CO 


CO 
CO 
CO 


o 
o 






Function 


hypothetical membrane protein 


hypothetical protein 




nitrate transport ATP-binding protein


maltose/maltodextrin transport ATP- 
binding protein 


nitrate transporter protein 






actinorhodin polyketide dimerase 


cobalt-zinc-cadmium resistance
protein






hypothetical protein 




D-3-phosphoglycerate 
dehydrogenase 


hypothetical serine-rich protein 






hypothetical protein 




Matched 
length 
(aa) 


CN 
CD 


CD 
CD 




I s - 

CD 


co 


CN 
CO 






CN 


o 

CO 






CN 
CO 




o 

CO 

tn 


m 
o 






o 

CN 
CD 




Similarity
(%)


100.0 


55.0 




80.8 


78.2 


56.8 






73.2 


72.7 






53.7 




100.0 


52.0 






63.1 




CO 










































Identity 
(%) 


100.0 


45.0 




50.9 


46.0 


28.1 






39.4 


39.1 






22.9 




99.8 


29.0 






32.9 




Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 yilV 


Sulfolobus solfataricus 




Synechococcus sp. nrtD 


Enterobacter aerogenes 
(Aerobacter aerogenes) malK 


Anabaena sp. strain PCC 7120 
nrtA 






Streptomyces coelicolor 


Ralstonia eutropha czcD 






Methanococcus jannaschii 




Brevibacterium flavum serA 


Schizosaccharomyces pombe 
SPAC11G7.01 






Rhodobacter capsulatus strain 
SB1003 




db Match 


sp:YILV_CORGL 


GP:SSU18930 26 
3 




sp:NRTD_SYNP7 


sp:MALK_ENTAE 


sp:NRTA_ANASP 






sp:DIM6_STRCO


sp:CZCD_ALCEU 






sp:Y686_METJA 




gsp:Y22646 


SP:YEN1_SCHPO 






pir:T03476 




ORF 


1473 


r— 
CO 
CN 


CD 

o 

CD 


CO 
CD 


CD 
CN 


CN 

CO 

CO 




CD 
CO 
CO 


CO 
CO 


tn 

CD 


CO 

in 


o 

CD 
CD 


1815 


1743 


1590 


CN 
CO 


CD 
CO 


1062 


1866 


CN 
O 


Terminal 
(nt) 


1336095 


1338379 


1342677 


1341960 


1342461 


1342794 


1344464 


1344808 


1345420 


1346439 


1345335 


1345642 


1348272 


1350076 


1352444 


1351727 


1353451 


1354540 


1357554 


1356853 


Initial 
(nt) 


1337567 


1338609 


1342072 


1342457 


1342727 


1343675 


1344018 


1344440


1344935 


1345486 


1345487 


1346331 


1346458 


1348334 


1350855 


1352053 


1352585 


1355601 


1355689 


1356452 


SEQ 

NO. 
(a.a.) 


4901 


4902 


4903 


4904 


4905 


4906 


4907 


4908 


4909 


4910 


4911 


4912 


4913 


4914 


4915 


4916 


4917 


4918 


4919 


4920 


SEQ 

NO. 
(DNA) 


o 


CN 
O 


CO 

o 


xr. 
o 


in 
o 
^- 


CD 
O 


o 


CO 

o 


CD 
O 


o 

5- 




CN 


CO 




in 


CD 


5- 


CO 

5- 


CD 
5 


o 

CN 



Function 






lipoprotein 




glycogen phosphorylase 






hypothetical protein 


hypothetical membrane protein 




guanosine 3',5'-bis(diphosphate) 3- 
pyrophosphatase 


acetate repressor protein 


3-isopropylmalate dehydratase large 
subunit 


3-isopropylmalate dehydratase small 
subunit 




mutator mutT protein ((7,8-dihydro-8-oxoguanine-triphosphatase)(8-oxo-dGTPase)(dGTP pyrophosphohydrolase))




NAD(P)H-dependent 
dihydroxyacetone phosphate 
reductase 


D-alanine-D-alanine ligase 


Matched 
length 
(aa) 






xr 
xr 




i^- 

o 

r^- 






CD 

o> 

CN 


CD 

m 

CN 




CO 


in 

CN 


CO 

xr 


m 

CD 




XT 
CD 
CN 




CO 
CO 


xr 

CO 


Similarity 
(%) 






74.0 




74.0 






52.8 


64.8 




60.1 


r-- 
o 

CO 


87.5 


89.2 




71.4 




72.2 


67.4 


Identity 
(%) 






61.0




44.2 






25.4 


25.4 




29.8 


26.1 


68.1 


67.7 




45.9 




45.0 


40.4 


Homologous gene 






Chlamydia trachomatis 




Rattus norvegicus (Rat) 






Bacillus subtilis yrkH 


Methanococcus jannaschii Y441 




Escherichia coli K12 spoT 


Escherichia coli K12 iclR


Actinoplanes teichomyceticus 
leu2


i 

Salmonella typhimurium 




Mycobacterium tuberculosis 
H37RvMLCB637.35c 




Bacillus subtilis gpdA 


Escherichia coli K12 MG1655 
ddlA


db Match 






GSP:Y37857 




sp:PHS1_RAT 






sp:YRKH_BACSU 


sp:Y441_METJA 




sp:SPOT_ECOLI 


sp:ICLR_ECOLI 


sp:LEU2_ACTTI 


sp:LEUD_SALTY 




gp:MLCB637_35 




sp:GPDA_BACSU 


sp:DDLA_ECOLI 


ORF
(bp)


CO 

xr 

CO 


co 
in 


CN 
CO 

T — 


CD 
CO 
CD 


2427 


co 

CO 


CO 

m 


1407 


o 
in 


XT 


xr 

CD 

in 


LO 

o 
r- 


1443 


CD 

in 


CO 

CO 


xr 
m 

CD 


co 

LO 


CD 
CD 
CD 


1080 


Terminal 
(nt) 


1371979 


1373131 


1373929 


1375491 


1373350 


1375805 


1375933 


1376149 


1377666 


1378466 


1379566 


1379555 


1381882 


1382492 


1382502


1382845 


1384085 


1385125 


1386232 


Initial 
(nt) 


1372326 


1372601 


1373798 


1374556 


1375776 


1375987 


1376088 


1377555 


1378415 


1378942 


1379003 


1380259 


1380440 


1381902 


1382819 


1383798 


1383930 


1384130 


1385153 


SEQ 
NO. 
(a.a.) 


4940 


4941 


4942 


4943 


4944 


4945 


4946 


4947 


4948 


4949 


4950 


4951 


4952 


4953 


4954


4955


4956 


4957 


4958 


SEQ 
NO. 
(DNA) 


1440


1441


1442


1443


1444


1445


1446


1447


1448


1449


1450


1451


1452


1453


1454


1455


1456


1457


1458



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Function 












insertion element (IS3 related) 




hypothetical protein 




















DNA polymerase I 


cephamycin export protein 


DNA-binding protein 


morphine-6-dehydrogenase 




Matched 
length 












CD 
CN 




co 




















CO 
CD 
CO 


CD 
LO 


CO 
CO 
CN 


CO 

CN 




Similarity
(%) 












CN 




o 




















CO 


CO 


^J" 
















CD 
CD 




CD 




















o 

CO 


CO 


LO 
CO 


CD 

r-- 




CO 


























































LO 




O 




















CO 


CO 


CO 


LO 
















CO 

CO 




CD 
CO 




















CD 
LO 


CO 
CO 




CO 




Homologous gene 












Corynebacterium glutamicum 
orf2 




Corynebacterium glutamicum 




















Mycobacterium tuberculosis 
polA 


Streptomyces lactamdurans 
cmcT 


Streptomyces coelicolor A3(2)
SCJ9A.15c 


Pseudomonas putida morA 




db Match 












S60890 




PIR:S60890 




















sp:DPO1_MYCTU


sp:CMCT_NOCLA


gp:SCJ9A_15


sp:MORA_PSEPU




ORF
(bp)


XT 


CN 
CO 


o 

LO 


CD 
CO 


CD 
CN 


CN 
CD 


LO 
LO 

CO 


\ — 


CD 
CD 
CO 


LO 
CO 


CN 
CO 


LO 
CO 


co 

CD 


CO 

o 

CO 


CD 

to 


CN 
CN 
CN 


CD 
CN 


2715 


1422 


CD 
O 
CD 


CO 
CO 


CD 
LO 


Terminal 
(nt) 


1402076 


1402703 


1402368 


1403991 


1404215 


1404694 


1405320 


1406999 


1407167 


1407559 


1408703 


1409428 


1410064 


1411119 


1411437


1412572 


1412626 


1416459 


1416462 


1418870 


1419748 


1419878 


Initial 
(nt) 


1401333 


1402272 


1402874 


1403128 


1403997 


1404885 


1406174 


1407109 


1407535 


1407873 


1409023 


1409802 


1411011 


1411424 


1412000


1412351 


1412916 


1413745 


1417883 


1417962 


1418876 


1420036 


SEQ 
NO. 
(a.a.) 


4977 


4978 


4979 


4980 


4981 


4982 


4983 


4984 


4985 


4986 


4987 


4988 


4989 


4990 


4991 


4992 


4993 


4994 


4995 


4996 


4997 


4998 


SEQ
NO.
(DNA)


1477


1478


1479


1480


1481


1482


1483


1484


1485


1486


1487


1488


1489


1490


1491


1492


1493


1494


1495


1496


1497


1498



Function 


hypothetical protein 


30S ribosomal protein S1 




hypothetical protein 










inosine-uridine preferring nucleoside
hydrolase (purine nucleosidase)


antiseptic resistance protein


ribose kinase 


cryptic asc operon repressor,
transcription regulator




excinuclease ABC subunit B 


hypothetical protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical protein 


hydrolase 


Matched 
length 
(aa) 


CO 
CD 


in 

^ 




LO 

cn 










o 

CO 


LO 


CO 

cn 

CN 


h- 
co 

CO 




CO 


CN 
LO 


CN 


cn 
r-- 

CN 




cn 

CO 
CO 


o 
in 


-r— 
CM 














































Similarity
(%) 


58.3 


71.4 




93.9 










81.0 


53.8 


67.6 


65.6 




83.3 


59.2 


80.2 


77.1 




47.2 


68.0 


58.4 


Identity 
(%) 


31.9 


39.5 




80.5 










61.9 


23.6 


35.5 


30.0 




57.4 


33.6 


38.8 


53.8 




23.2 


32.7 


30.4 


Homologous gene 


Streptomyces coelicolor 
SCH5.13yafE 


Escherichia coli K12 rpsA 




Brevibacterium lactofermentum 
ATCC 13869 yacE 










Crithidia fasciculata iunH


Staphylococcus aureus 


Escherichia coli K12 rbsK 


Escherichia coli K12 ascG 




Streptococcus pneumoniae 
plasmid pSB470 uvrB 


Methanococcus jannaschii 
MJ0531 


Escherichia coli K12 ytfH 


Escherichia coli K12 ytfG




Bacillus subtilis yvgS 


Streptomyces coelicolor A3(2) 
SC9H11.26c


Escherichia coli K12 ycbL 


db Match 


sp:YAFE_ECOLI 


sp:RS1_ECOLI 




sp:YACE_BRELA 










sp:IUNH_CRIFA


sp:QACA_STAAU 


sp:RBSK_ECOLI 


sp:ASCG_ECOLI




sp:UVRB_STRPN


sp:Y531_METJA


sp:YTFH_ECOLI


sp:YTFG_ECOLI




pir:H70040 


gp:SC9H11_26 


sp:YCBL_ECOLI


ORF 


LO 

CD 


1458 


1476 


O 

o 

CO 


1098 


CN 
CO 
LO 


CO 
CN 


LO 

cn 


CO 
CO 

cn 


1449 


t — 

CN 

cn 


1038 


CO 

cn 
h- 


2097 


5- 


T — 

CO 
CO 


CO 
CO 


CO 
CO 


2349 


CN 

5) 


o 
o 

CD 


Terminal 
(nt) 


1420071 


1422556 


1421096 


1425878 


1427354 


1427376 


1427804 


1429246 


1428224


1429194 


1430659 


1431575 


1433547 


1436201 


1436775 


1436869 


1438201 


1440026 


1438212 


1440675 


1441793 


Initial 
(nt) 


1420724 


1421099 


1422571 


1425279 


1426257 


1427957 


1428049 


1428290 


1429159 


1430642


1431579 


1432612 


1432750 


1434105 


1436335 


1437249 


1437356 


1439343 


1440560 


1441586 


1442392 


SEQ 
NO. 
(a.a.) 


4999 


5000 


5001 


5002 


5003 


5004 


5005 


5006 


5007 


5008


5009 


5010 


5011 


5012 


5013 


5014 


5015 


5016 


5017 


5018 


5019 


SEQ 
NO. 
(DNA) 


1499


1500


1501


1502


1503


1504


1505


1506


1507


1508


1509


1510


1511


1512


1513


1514


1515


1516


1517


1518


1519
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Function 


excinuclease ABC subunit A 


hypothetical protein 1246 (uvrA 
region) 


hypothetical protein 1246 (uvrA 
region) 






translation initiation factor IF-3 


50S ribosomal protein L35


50S ribosomal protein L20 






sn-glycerol-3-phosphate transport 
system permease protein 


sn-glycerol-3-phosphate transport 
system protein 


sn-glycerol-3-phosphate transport 
system permease protein


sn-glycerol-3-phosphate transport 
ATP-binding protein 


hypothetical protein 


glycerophosphoryl diester 
phosphodiesterase 


tRNA (guanosine-2'-O-)-methyltransferase


phenylalanyl-tRNA synthetase alpha 
chain 


Matched
length
(aa)


CN 

tn 

CO 


o 
o 
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CO 
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Similarity 
(%) 


CD 


o 


O 






CN 
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CO 
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o 
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CO 
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CD 
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CD 
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CN 
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in 




o 






CN 


co 


co 


o 


o 


CN 


o 






CD 
LO 


o 


T — 

CO 






CN 

m 




ih 






CO 
CO 


CO* 
CO 


CO 
CN 






CO* 
CN 


CO 




Homologous gene 


Escherichia coli K12 uvrA 


Micrococcus luteus 


Micrococcus luteus 






Rhodobacter sphaeroides infC 


Mycoplasma fermentans 


Pseudomonas syringae pv. 
syringae 






Escherichia coli K12 MG1655 
ugpA


Escherichia coli K12 MG1655
ugpE


Escherichia coli K12 MG1655 
ugpB 


Escherichia coli K12 MG1655 
ugpC 


Aeropyrum pernix K1 APE0042 


Bacillus subtilis glpQ 


Escherichia coli K12 MG1655 
trmH 


Bacillus subtilis 168 syfA 


db Match 


sp:UVRA_ECOLI


PIR:JQ0406


PIR:JQ0406






sp:IF3_RHOSH


sp:RL35_MYCFE


sp:RL20_PSESY






sp:UGPA_ECOLI


sp:UGPE_ECOLI


sp:UGPB_ECOLI


sp:UGPC_ECOLI


pir:E72756


sp:GLPQ_BACSU


sp:TRMH_ECOLI


sp:SYFA_BACSU

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(DP) 
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CD 
O 
CO 
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CD 
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1314 
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CN 




CD 
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Terminal 
(nt) 


1445333 


1443810 


1444944 


1446874 


1445323 


1448358 


1448581 


1449025 


1449119 


1450692 


1451820 


1452653 


1454071 


1455338 


1454102 


1455350 


1456948 


1458066 


Initial 
(nt) 


1442487 


1444115 


1445393 


1446158 


1447446 


1447792 


1448390 


1448645 


1449940 


1450126 


1450918 


1451820 


1452758 


1454115 


1454350 


1456066 


1456355 


1457047



Function 


hypothetical protein 


translation initiation factor IF-2 


hypothetical protein 




hypothetical protein 


hypothetical protein 


DNA repair protein 


hypothetical protein 


hypothetical protein 


CTP synthase (UTP-ammonia 
ligase) 


hypothetical protein 


tyrosine recombinase 


tylosin resistance ATP-binding
protein


chromosome partitioning protein or 
ATPase involved in active 
partitioning of diverse bacterial 
plasmids 


hypothetical protein 




thiosulfate sulfurtransferase 


hypothetical protein 


ribosomal large subunit 
pseudouridine synthase B 


Matched 
length 
(aa) 


CO 


CM 

CO 


CO 




o 

CO 
CM 


LO 

CM 
CM 


h- 

LO 


CO 

CO 


CO 
CO 


CO 
LO 


LO 


o 
o 

CO 


LO 
LO 


CO 
LO 
CN 


LO 
CN 




o 

CM 


CM 


CO 
CN 
CN 


Similarity 
(%) 


66.0 


67.0 


60.1 




CO 
CO 
CO 


31.6 


63.4 


73.1 


68.1 


76.7 


71.3 


71.7 


59.7 


73.6 


64.5 




67.0 


65.7 


72.5 


Identity 
(%) 


61.0 


36.3 


CO 
CO 
CM 




38.5 


31.6 


31.4 


41.9 


o 

CO 


55.0 


36.3 


39.7 


30.5 


44.6 


28.3 




CD 

LO 
CO 


33.1 


45.9 


Homologous gene 


Chlamydia pneumoniae 


Borrelia burgdorferi IF2 


Bacillus subtilis yzgD 




Bacillus subtilis yqxC 


Mycobacterium tuberculosis 
H37Rv Rv1695 


Escherichia coli K12 recN


Mycobacterium tuberculosis 
H37Rv Rv1697 


Mycobacterium tuberculosis 
H37Rv Rv1698 


Escherichia coli K12 pyrG


Bacillus subtilis yqkG 


Staphylococcus aureus xerD 


Streptomyces fradiae tlrC 


Caulobacter crescentus parA 


Bacillus subtilis ypuG 




Datisca glomerata tst 


Bacillus subtilis ypuH 


Bacillus subtilis rluB 


db Match 


GSP:Y35814 


sp:IF2_BORBU 


sp:YZGD_BACSU 




sp:YQXC_BACSU 


LU 

< 
X 

I 

CD 
—> 
LL 
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CL 
CO 


sp:RECN_ECOLI 


pir:H70502 


pir:A70503 


sp:PYRG_ECOLI 


sp:YQKG_BACSU 


gp:AF093548_1 


cn 

LL 
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i- 
co 

1 
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cn 
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gp:CCU87804_4 


sp:YPUG_BACSU 




gp:AF109156_1 
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co 
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CO 
LO 


CD 
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(nt) 


1483724 


1486027 


1487025 


1487193 


1488056 


1489018 


1490881


1492134 


1493109 


1495174 


1495861 


1496772 


1496795 


1499645 


1500695 


1500911 


1502576 


1503176 


1504238 
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(nt) 


1483996 


1484675 


1486042 


1487032 


1487238 


1488146 


1489103 


1490944 


1492147 


1493513 


1495205 


1495861 


1498324 


1498863 


1499931 


1501471 


1501710 


1502634 


1503483 
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Function 




phosphomethylpyrimidine kinase 


hydroxyethylthiazole kinase


cyclopropane-fatty-acyl-phospholipid 
synthase 


sugar transporter or 4-methyl-o- 
phthalate/phthalate permease 


purine phosphoribosyltransferase 


hypothetical protein 


arsenic oxyanion-translocation pump 
membrane subunit 




hypothetical protein 


sulfate permease 


hypothetical protein 










hypothetical protein 


dolichol phosphate mannose 
synthase 


apolipoprotein N-acyltransferase 




secretory lipase 


Matched 
length 
(aa) 




CN 
CO 
CN 


CD 
CN 


to 


CO 
CO 


CO 
LO 


CO 

o 

CN 


CD 
CO 




CN 
CN 
CN 


CD 
CO 


^- 

CD 










o 


CN 


CN 
LO 




CN 
CD 
CO 


Similarity 
(%) 




70.2 


77.5 


55.0 


66.9 


59.0 


68.5


54.6 




83.8 


83.6 


50.0 










87.3


71.0 


55.6 




55.6 


Identity 
(%) 




47.3 


CD 
CD 


28.6 


32.5 


36.5 


39.8


23.3 




62.2 


51.8 


39.0 










71.8 


39.2 


25.1 




23.7 


Homologous gene 




Salmonella typhimurium thiD 


Salmonella typhimurium LT2 
thiM 


Mycobacterium tuberculosis 
H37Rv ufaA1 


Burkholderia cepacia Pc701 
mopB 


Thermus flavus AT-62 gpt


Escherichia coli K12 yebN 


Sinorhizobium sp. As4 arsB 




Streptomyces coelicolor A3(2) 
SCI7.33 


Pseudomonas sp. R9 ORFA 


Pseudomonas sp. R9 ORFG 










Mycobacterium tuberculosis 
H37Rv Rv2050 


Schizosaccharomyces pombe 
dpm1


Escherichia coli K12 lnt




Candida albicans lip1


db Match 




sp:THID_SALTY 


sp:THIM_SALTY


pir:H70830 


prf:2223339B 


prf:2120352B 


sp:YEBN_ECOLI 


gp:AF178758_2 




gp:SCI7_33 


gp:PSTRTETC1_6 


GP:PSTRTETC1_7










pir:A70945 


prf:2317468A 


sp:LNT_ECOLI 




gp:AF188894_1 




CN 
O 
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o 

CO 


1314 


1386 


h- 


CD 
CD 
CO 


CO 
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CD 


CO 
CO 


CO 
CD 
CO 


1455 


CO 
CN 
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LO 
CO 


o 

CN 


CD 
CO 

T — 


o 

LO 


CD 
CD 
CO 


o 

CO 


1635 


r— 


1224 


Terminal 
(nt) 


1538963 


1539820 


1542119 


1546289 


1546307 


1547967 


1549349 


1550398 


1550951 


1552237 


1553972 


1553297 


1554070 


1555067 


1554891 


1555086 


1556771 


1557014 


1557859 


1559497 


1560437 


Initial 
(nt) 


1539664 


1541403 


1542922 


1544976 


1547692 


1548440 


1548651 


1549403 


1550469 


1551545 


1552518 


1553722 


1554684 


1554861 


1555079 


1555835 


1556376 


1557823 


1559493 


1560237 


1561660 
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NO. 
(a.a.) 


5116 


5117 


5118 


5119 


5120 


5121 


5122 


5123 


5124 


5125 


5126 


5127 


5128 


5129 


5130 
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5132 


5133 


5134 


5135 


5136



Function 


precorrin 2 methyltransferase


precorrin-6Y C5,15-methyltransferase






oxidoreductase 


dipeptidase or X-Pro dipeptidase 




ATP-dependent RNA helicase 


sec-independent protein translocase 
protein 


hypothetical protein 


hypothetical protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical protein 


hypothetical protein 


Matched 
length 
(aa)


cn 

CN 


? 






CN 


CN 
CO 
CO 




1030 


CO 
CO 
CN 


m 

CO 


co 


CN 
CO 


co 




CD 


CD 

in 


CD 

in 


Similarity
(%) 


56.7 


60.8 






75.4 


61.3 




LO 

in 


CN 
CD 


69.4 


61.2 


64.8 


77.3 




80.3 


74.2 


50.0 


CO 




































Identity 
(%) 


31.3 


32.4 






54.1 


36.1 




26.5 


28.7 


44.7 


31.9 


32.4 


53.1 




54.1 


48.6 


42.0 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv cobG 


Pseudomonas denitrificans 
SC510 cobL






Mycobacterium tuberculosis 
H37Rv RV3412 


Streptococcus mutans LT11
pepQ 




Saccharomyces cerevisiae 
YJL050W dob1 


Escherichia coli K12 tatC


Mycobacterium leprae 
MLCB2533.27 


Mycobacterium tuberculosis 
H37Rv Rv2095c 


Mycobacterium leprae 
MLCB2533.25 


Mycobacterium tuberculosis 
H37Rv Rv2097c




Mycobacterium tuberculosis 
H37Rv Rv2111c 


Mycobacterium tuberculosis 
H37Rv Rv2112c 


Aeropyrum pernix K1 APE2014 


db Match 


pir:C70764 


sp:COBL_PSEDE 






sp:YY12_MYCTU 


gp:AF014460_1 




sp:MTR4_YEAST 


sp:TATC_ECOLI 


sp:YY34_MYCLE 


sp:YY35_MYCTU 


sp:YY36_MYCLE 


sp:YY37_MYCTU 




pir:B70512 


pir:C70512 


PIR:H72504 


ORF 
(bp) 
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CD 
CO 
CO 


CD 
CN 


CO 
CO 
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CO 
CD 


2787 


1002 


m 
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CD 


CN 
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CN 


CN 
O) 
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CO 


Terminal 
(nt) 


1562553 


1562525 


1564237 


1564482 


1564565 


1565302 


1567106 


1567117 


1569932 


1571068 


1571506 


1572492 


1573491 


1575205 


1574945 


1575406 


1577806 


Initial 
(nt) 


1561780 


1563802 


1563872 


1564237 


1565302 


1566438 


1566468 


1569903 


1570933 


1571382 


1572486 


1573463 


1574915 


1574957 


1575136 


1576947 


1577327 
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(a.a.) 
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CN 

in 

CD 


CO 

in 

CD 



Function 


AAA family ATPase (chaperone-like 
function) 


protein-beta-aspartate 
methyltransferase 


aspartyl aminopeptidase 


hypothetical protein 


virulence-associated protein 


quinolone resistance protein


aspartate ammonia-lyase 


ATP phosphoribosyltransferase 


beta-phosphoglucomutase 


5-methyltetrahydrofolate- 
homocysteine methyltransferase 




alkyl hydroperoxide reductase 
subunit F 


arsenical-resistance protein 


arsenate reductase 


arsenate reductase 




cysteinyl-tRNA synthetase 


Matched 
length 
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m 
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CN 
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CO 
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CO 
CO 


CO 
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CO 
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CD 
CO 
CO 


CO 
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CO 


CO 
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co 

CO 


Similarity 
(%) 


78.5 


79.0 


67.2 


71.4 


72.5 


61.0 


99.8 


LO 
CO 


63.1 


CN 
CO 




49.5 


63.9 


64.3 


75.6 




64.3 


Identity 
(%) 


51.6 


57.3 


38.1 


45.4 


40.6 


21.8 


99.8 


96.8 


30.8 


31.6 




22.4 


33.0 


32.6 


47.2 




35.9 


Homologous gene 


Rhodococcus erythropolis arc 


Mycobacterium leprae pimT 


Homo sapiens 


Mycobacterium tuberculosis 
H37Rv Rv2119 


Dichelobacter nodosus A198 
vapI


Staphylococcus aureus norA23 


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ233 
aspA 


Corynebacterium glutamicum 
AS019 hisG


Thermotoga maritima MSB8 
TM1254 


Escherichia coli K12 metH 




Xanthomonas campestris ahpF 


Saccharomyces cerevisiae 
S288C YPR201W acr3 


Staphylococcus aureus plasmid 
pI258 arsC


Mycobacterium tuberculosis 
H37Rv arsC 




Escherichia coli K12 cysS 


db Match 


prf:2422382Q 


pir:S72844 


gp:AF005050_1 


pir:B70513 


sp:VAPI_BACNO 


prf:2513299A 


sp:ASPA_CORGL 


gp:AF050166_1


pir:H72277 


sp:METH_ECOLI 




sp:AHPF_XANCH 


sp:ACR3_YEAST 


sp:ARSC_STAAU 


pir:G70964 
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1582114 
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1585603 


1586812 
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1591912 
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1594951 


1595668 


1595844 


1596249 
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1585490 
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1595030 


1596221 
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hypothetical protein 


nitrogen fixation protein 


ABC transporter ATP-binding protein 


hypothetical protein 


ABC transporter 


DNA-binding protein 
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Homologous gene 


Aeropyrum pernix K1 APE2025 


Mycobacterium leprae nifS 


Streptomyces coelicolor A3(2) 
SCC22.04c 


Mycobacterium tuberculosis 
H37Rv Rv1462 


Synechocystis sp. PCC6803 
slr0074 


Streptomyces coelicolor A3(2) 
SCC22.08c 


Mycobacterium tuberculosis 
H37Rv Rv1459c 


Mycobacterium leprae 
MLCL536.31 abc2 


Mycobacterium leprae 
MLCL536.32 


Mycobacterium tuberculosis 
H37Rv Rv1456c 




Pyrococcus horikoshii PH0450 


Escherichia coli K12 qor 


Nitrobacter winogradskyi coxC 


Corynebacterium glutamicum 
ATCC 31833 tkt 
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MLCL536.39 tal 
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Mycobacterium tuberculosis 
H37Rv Rv1446c opcA 


Saccharomyces cerevisiae 
S288C YHR163W sol3


Bacillus sp. NS-129 


Rhodococcus erythropolis 


Corynebacterium glutamicum 
ATCC 13032 soxA 








Corynebacterium glutamicum 
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Saccharomyces cerevisiae 
YCR013C 


Corynebacterium glutamicum 
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CD 



CD 



Function 


bacterial regulatory protein, arsR 
family 


ABC transporter 




iron(III) ABC transporter,
periplasmic-binding protein 


ferrichrome transport ATP-binding 
protein 


shikimate 5-dehydrogenase 


hypothetical protein 


hypothetical protein 


alanyl-tRNA synthetase 


hypothetical protein 




aspartyl-tRNA synthetase 


hypothetical protein 


glucan 1,4-alpha-glucosidase 


phage infection protein 




transcriptional regulator 


Matched 
length 
(aa) 


CO 
CO 


O 

^- 

co 




CO 

h- 
co 


o 

CO 
CN 


cn 

LO 
CN 


LO 

cn 

CO 


CO 


cn 

CO 


LO 

^- 




cn 

LO 


cn 

CN 


cn 

CO 
CO 


CN 




CN 

cn 


Similarity
(%) 


68.7 


73.2 




50.7 


71.7 


60.0 


70.1 


69.6 


71.8 


84.8 




89.2 


74.1 


53.6 


54.0 




62.0 


to 




































Identity 
(%) 


CO 
LO 


35.9 




23.6 


38.3 


50.0 


41.8 


CO 

CN 
LO 


43.3 


65.4 




71.1 


46.1 


26.1 


23.1 




29.2 


Homologous gene 


Streptomyces coelicolor A3(2) 
SC1A2.22 


Corynebacterium diphtheriae 
hmuU 




Pyrococcus abyssi Orsay 
PAB0349 


Bacillus subtilis 168 fhuC 


Mycobacterium tuberculosis 
H37Rv aroE 


Mycobacterium tuberculosis 
H37Rv Rv2553c 


Mycobacterium tuberculosis 
H37Rv Rv2554c 


Thiobacillus ferrooxidans ATCC 
33020 alaS 


Mycobacterium tuberculosis 
H37Rv Rv2559c 




Mycobacterium leprae aspS 


Mycobacterium tuberculosis 
H37Rv Rv2575 


Saccharomyces cerevisiae 
S288C YIR019C sta1


Bacillus subtilis yhgE 




Streptomyces coelicolor A3(2) 
SCE68.13 


db Match 


gp:SC1A2_22 


gp:AF109162_2 




pir:A75169 


sp:FHUC_BACSU 


pir:D70660 


pir:E70660 


pir:F70660 


sp:SYA_THIFE


sp:Y0A9_MYCTU 




sp:SYD_MYCLE 


sp:Y0BQ_MYCTU 


sp:AMYH_YEAST 


sp:YHGE_BACSU 




gp:SCE68_13


ORF
(bp)
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CO 
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co 
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CO 
CN 
CO 


1167 


CD 
LO 


2664 


1377 


1224 
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cn 

CO 


2676 


1857 


CO 
CD 


cn 

LO 


Terminal 
(nt) 


1721423 


1722853 


1722202 


1723826 


1724578 


1724612 


1725459 


1726625 


1727385 


1730166 


1731599 


1732988 


1735946 


1736004 


1738713 


1740572 


1741906 


Initial 
(nt) 


1721725 


1721780 


1722807 


1722870 


1723826 


1725439 


1726625 


1727170 


1730048 


1731542 


1732822


1734811 


1735056 


1738679 


1740569 


1741219 


1741313 


SEQ 

NO. 
(a.a.) 


5294 


5295 


5296 


5297 


5298 


5299 
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5301 


5302 
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03 



Function 




oxidoreductase 




NADH-dependent FMN reductase 


L-serine dehydratase 




alpha-glycerolphosphate oxidase 


histidyl-tRNA synthetase 


hydrolase 


cyclophilin 




hypothetical protein 




GTP pyrophosphokinase 


adenine phosphoribosyltransferase 


dipeptide transport system 


hypothetical protein 


protein-export membrane protein 




Matched 
length 
(aa) 




T — 
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CD 
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CN 
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CO 
CD 
LO 


CN 


x — 
x — 
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LO 




CO 
CN 




o 

CD 
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LO 

CO 


CO 


CO 
LO 
LO 


CN 
CO 
CO 




Similarity 
(%) 




88.1 




77.6 


71.4 




53.9


72.2 


62.1 


61.1 




100.0 




99.9 


100.0 


98.8 


60.9 


57.2 




Identity 
(%) 




72.8 




37.1 


46.8 




28.4 


CN 
CO 


40.3 


35.4 




98.4 




99.9 


99.5 


98.0 


30.7 


25.9 




Homologous gene 




Streptomyces coelicolor A3(2) 
SCE15.13c 




Pseudomonas aeruginosa PAO1
slfA 


Escherichia coli K12 sdaA 




Enterococcus casseliflavus glpO 


Staphylococcus aureus 
SR17238 hisS 


Campylobacter jejuni 
NCTC11168 Cj0809c


Streptomyces chrysomallus 
sccypB 




Corynebacterium glutamicum 
ATCC 13032 orf4 




Corynebacterium glutamicum 
ATCC 13032 rel 


Corynebacterium glutamicum 
ATCC 13032 apt 


Corynebacterium glutamicum 
ATCC 13032 dciAE 


Mycobacterium tuberculosis 
H37Rv Rv2585c 


Escherichia coli K12 secF 




db Match 




gp:SCE15_13 




sp:SLFA_PSEAE 


sp:SDHL_ECOLI




prf:2423362A 


sp:SYH_STAAU 


gp:CJ11168X3_127


prf:2313309A 




gp:AF038651_4 




gp:AF038651_3 


gp:AF038651_2 


gp:AF038651_1


sp:Y0BG_MYCTU 


sp:SECF_ECOLI 
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1750933 
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1741893 
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1744884 
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1747918 


1749276 
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1750427 


1750964 


1751497 


1752186 


1754894 


1755479 


1755748 


1757228 


1758797 


1759707 


SEQ 
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5311 


5312 


5313 


5314 


5315 


5316 


5317 


5318 


5319 


5320 


5321 


5322


5323 


5324 


5325 


5326 


5327 


5328 


5329 


SEQ
NO.
(DNA)


1811


1812


1813


1814


1815


1816


1817


1818


1819


1820


1821


1822


1823


1824


1825


1826


1827


1828


1829






Function 












puromycin N-acetyltransferase 






















ferric transport ATP-binding protein 










pantothenate metabolism 
flavoprotein 






Matched 
length 
(aa) 
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CN 
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Similarity 
(%) 












64.2 






















28.7 










66.7 






Identity 
(%) 












36.3 






















28.7 










27.1 






Homologous gene 












Streptomyces anulatus pac 






















Actinobacillus 
pleuropneumoniae afuC 










Zymomonas mobilis dfp 






db Match 
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sp:AFUC_ACTPL 
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CO 
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Terminal 
(nt) 


1777646 


1778037 


1778102 


1779554 


1780507 


1781019 


1782790 


1784381 


1783382 


1782894 


1785732 


1786907 


1789562 


1789768 


1790057 


1790461 


1792438 


1793426 


1793496 


1794820 


1795621 


1796181 


1797049 


1797769 


Initial 
(nt) 


1777269 


1777444 


1779508 


1780168 


1780905 


1781585 


1781705 


1783281 


1784080 


1785473 


1786844 


1788829 


1789080 


1789580 


1789746 


1790889 


1791842 


1792428 


1793654 


1793714 


1795202 


1795591 


1796186 


1797350 
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NO. 
(a.a.) 
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5352 
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Function 






































transposon TN21 resolvase 






protein-tyrosine phosphatase 






Matched 
length 
(aa) 






































CD 
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CO 






Similarity 
(%) 






































78.0 






51.8 






Identity 
(%) 






































51.1 






29.3 







Homologous gene 






































Escherichia coli tnpR 






Saccharomyces cerevisiae 
S288C YIR026C yvh1






db Match 
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sp:PVH1_YEAST 
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1797850 


1798023 
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1800366 
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1801307 


1802096 


1802155 


1803419 


1803893 


1804598 


1804865 


1805599 


1806686 


1807396 


1808113 


1808421 


1808832 


1810372 


1811545 


1811938 


1812691 


1813606 


1812460 


Initial 
(nt) 


1797969 


1798757 


1799182 


1799473 


1800604 


1800834 


1801344 


1802577 


1802733 


1803465 


1804134 


1804629 


1804919 


1805727


1806917 


1807433 


1808137 


1808458 


1809761 


1810541 


1811564 


1812215 


1812881 


1812882 


SEQ 
NO. 
(a.a.) 


5372 


5373 


5374 


5375 


5376 


5377 


5378 


5379 


5380 


5381 


5382


5383 


5384 


5385 


5386 


5387 


5388 


5389 


5390 


5391 


5392 


5393 


5394 


5395 


SEQ
NO.
(DNA)


CN 

oo 


CO 
CO 


r-. 
co 


LO 
CO 


CD 

r-- 

CO 


1877 


CO 
CO 


CD 
CO 


o 

CO 
CO 


CO 
CO 


CN 
CO 

CO 


CO 
CO 
CO 


CO 

CO 


LO 
CO 
CO 


CD 
CO 
CO 


CO 
CO 


CO 
CO 
CO 


CD 
CO 
CO 


o 

CD 
CO 


CD 
CO 


CN 
CD 
CO 


CO 
CD 
CO 


CD 
CO 


LO 
CD 
CO 
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c 
o 
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CD -C ^ 
03 CD 



CO 

I- 

CO 



■o 

CD 



c 
o 
o 



JQ 



CD 



o 

o 

£ 

o 
X 



E c= 
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o 
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o 
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o 
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CO 
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3 
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< 
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00 
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co 
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CD 
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CO 

£ 
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O 



in 

co 

CM 
CM 
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O 
CN 
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O 
CM 
CO 
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o 

LO 
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o 
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to 
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CO 
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CD 
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o 
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E 

O 

E 

CO 

E 

D 
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.a 
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CO 

o 

CO 
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in 
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CO 
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CO 
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CO 



3 
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k_ 

CO 
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E 

CD 
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o 
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CO 
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"o 
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.O 
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c 
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o 

CO 
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o 
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Q_ 
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LU 
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CL 
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CD 
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O 
O 



Function 








helicase 




phage N15 protein gp57 




















actin binding protein with SH3 
domains 










ATP/GTP binding protein 




ATP-dependent Clp proteinase ATP-binding subunit


Matched 
length 
(aa) 








o 

CM 
CO 




CO 

o 




















CN 
CN 










TT 

co 




o 

CO 
CD 


Similarity 
(%) 








44.7 




64.2 




















49.8 










52.5 




61.0 


Identity 
(%) 








22.1 




36.7 




















28.7 










23.6 




30.2 


Homologous gene 








Mycoplasma pneumoniae ATCC 
29342 yb95 




Bacteriophage N15 gene57 




















Schizosaccharomyces pombe 
SPAPJ760.02c 










Streptomyces coelicolor
SC5C7.14 




Escherichia coli K12 clpA 


db Match 








sp:Y018_MYCPN 




pir:T13144 




















gp:SPAPJ760_2










gp:SC5C7_14




sp:CLPA_ECOLI 


ORF 


3789 


TT 


CO 

m 


1839 


to 
co 


CO 
CO 
CO 


CD 
CD 
CO 


CO 

to 


CO 
LO 


CO 
CN 

in 


CO 
CO 


CD 
CO 


CN 
CO 


CO 
CO 


CD 

r-- 

LO 


1221 


CN 
LO 
CO 


1395 


CO 

m 


o 

CO 


1257 


1854 


1965 


Terminal 
(nt) 


1842137 


1842681 


1843337 


1845356 


1845857 


1846207 


1846333 


1847932 


1848474 


1849036 


1849785 


1849966 


1850406 


1849978 


1850474 


1852440 


1852324 


1853873 


1854854 


1855237 


1856788 


1858738 


1860727 


Initial 
(nt) 


1838349 


1842235 


1842804 


1843518 


1845483 


1845872 


1846698 


1847315 


1847938


1848509


1848988 


1849781 


1850035 


1850415 


1851049 


1851220 


1851473 


1852479 


1854261 


1855058 


1855532 


1856885 


1858763 


SEQ 
NO. 
(a.a.) 


5418 


5419 


5420 


5421 


5422 


5423 


5424 


5425 


5426 


5427 


5428 


5429 


5430


5431 


5432 


5433 


5434 


5435 


5436 


5437 


5438 


5439 


5440 
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OO 

co 


CO 
CO 


o 
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CJ) 
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CO 
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CN 
CO 
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CN 
CO 


CN 
CO 


in 

CN 
CO 


CD 
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CO 
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CN 
CO 


CO 
CN 
CO 


CO 
CN 
CO 


o 

CO 
CO 


T — 

CO 
CD 


CM 

CO 
CO 


CO 
CO 
CO 


co 

CO 


LO 
CO 
CO 


CO 
CO 
OO 


CO 
CO 


OO 
CO 
CO 


CO 
CO 
CO 


o 

XT 

CO 



Function 










ATP-dependent helicase










hypothetical protein 


deoxynucleotide monophosphate 
kinase 










type II 5-cytosine
methyltransferase 


type II restriction endonuclease






hypothetical protein 
















































Matched
length
(aa)










CO 
CO 
CD 










CN 
CN 


CO 

0 

CN 










CO 
CO 
CO 


CO 
LO 
CO 






0 

LO 




Similarity
(%) 




















































CO 










CO 


LO 
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CO 












LO 
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CO 










co" 

CO 


CO 
CO 






LO 
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CO 
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CO 
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CN 
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CD 




lg 










CN 










LO 
CN 


co 










CO 
CO 


CO 
CO 






CN 




Homologous gene 










Staphylococcus aureus SA20 
pcrA 










Streptomyces coelicolor A3(2) 
SCH17.07C 


Bacteriophage phi-C31 gp52 










Corynebacterium glutamicum 
ATCC 13032 cglIM


Corynebacterium glutamicum 
ATCC 13032 cglIR






Streptomyces coelicolor A3(2) 
SC1A2.16c 




db Match 










sp:PCRA_STAAU 










gp:SCH17_7 


prf:2514444Y 










prf:2403350A 


pir:A55225 






gp:SC1A2_16 




ORF 

(bp)


r*- 


CD 
LO 


CN 
CO 


CN 
CO 


2355 


CO 
LO 
LO 


CO 
CO 


LO 
CO 


CO 
CN 


ILL 


CN 
O 


LO 

CN 
CN 


2166 


CO 
CN 


6507 


1089 


1074 


1521 


Li I 


1818 


CD 
CO 


Terminal 
(nt) 


1861225 


1861475 


1861519 


1862399 


1865299 


1865822 


1866219 


1866792 


1867095 


1867874 


1868587 


1868671 


1868927 


1871101 


1871380 


1879400 


1880485 


1882470 


1884220 


1887047 


1887590 


Initial 
(nt) 


1860752 


1861320 


1861842 


1862088 


1862945 


1865265 


1865842 


1866328 


1866832 


1867098 


1867886 

i 


1868895 


1871092 


1871373 


1877886 


1878312 


1879412 


1883990 


1884936 


1885230 


1887405 


SEQ 

NO. 
(a.a.) 


5441 


5442 


5443 


5444 


5445 


5446 


5447 


5448 


5449 


5450 


5451 


5452 


5453 


5454 


5455 


i 
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5457 


5458 


5459 


5460 


5461 
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co 


CN 
CO 


CO 

co 


CO 


LO 

co 


CD 
"3- 
CO 


CO 


CO 
T 
CO 


CO 

co 


0 

LO 

CO 


LO 
CO 


CN 
LO 
CO 


CO 
LO 
CO 


LO 
CO 


LO 
LO 

CO 


CO 
LO 
CO 


r^- 

LO 

CO 


1958 


CO 
LO 
CO 


0 

CO 
CO 


CO 
CO 






"O 
CD 



Function 


SNF2/Rad54 helicase-related 
protein 


hypothetical protein 




hypothetical protein 








endopeptidase Clp ATP-binding
chain B 














nuclear mitotic apparatus protein 




















Matched 
length 
(aa) 


o 

CD 


CO 
CD 




co 

LO 








CN 














1004 






































































Similarity
(%)


70.0 


56.4 




47.9 








52.5 














CD 
-<T 




















CO 
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CO 


































Identity
(%)


CD 


CO* 
CO 




o 

CM 








LO 

CN 














O 
CM 
























CD 




CO 










































Homologous gene 


Deinococcus radiodurans 
DR1258 


Lactobacillus phage phi-gl 
Rorf232 




Bacillus anthracis pX02-1 








Escherichia coli clpB














Homo sapiens numA 




















db Match 


gp:AE001973_4 


pir:T13226 




gp:AF188935_16 








sp:CLPB_ECOLI 














pir:S23647 




















n 


LO 
CO 


CD 
CO 


O 
CO 
CO 


1680 


1206 


1293 


2493 


1785 


CM 
CD 


1113 


CO 

co 


CO 
CD 


CD 

r- 
co 


CO 
CD 

T — 


2766 


O 
O 

to 


1251 


CO 
CD 
CO 




1008 


1659 


1488 


CD 
CD 
CO 


1509 


Terminal 
(nt) 


1887688 


1888231 


1889859 


1890028 


1891832 


1893388 


1894739 


1897374 


1899233 


1899804 


1901066 


1902955 


1902005 


1903225 


1903113 


1905973 


1906664 


1907965 


1908785 


1909501 


1910642 


1912333 


1913973 


1914725 


Initial 
(nt) 


1888038 


1889094 


1889530 


1891707 


1893037 


1894680 


1897231 


1899158 


1899853 


1900916 


1901911 


1901975 


1902883 


1903028 


1905878 


1906572


1907914 


1908660 


1909498 


1910508 


1912300 


1913820 


1914371 


1916233 


SEQ 
NO. 
(a.a.)


5462 


5463 


5464 


5465 


5466 


5467 


5468 


5469 


5470 


5471


5472 


5473 


5474 


5475 


5476 


5477 


5478 


5479 


5480 


5481 


5482 


5483 


5484 


5485 


SEQ
NO.
(DNA)


CM 
CO 
CD 


CO 
CD 
CD 


CO 
CD 


LO 
CO 
CD 


CD 
CO 
CD 


CO 
CD 


CO 
CD 
CD 


CD 
CD 
CD 


o 

CD 


CD 


CM 
CD 


CO 

r— 

CD 


CD 


LO 
CD 


CD 

r-~ 

CD 


CD 


CO 
CD 


CD 
CD 


o 

CO 
CD 


CO 
CD 


CM 
CO 
CD 


CO 
CO 
CD 


CO 
CD 


LO 
CO 
CD 



3 



CD 



Function 




















submaxillary apomucin 






modification methylase 










hypothetical protein 






hypothetical protein 








Matched 
length 
(aa) 




















1408 






CD 
















co 
CN 
CO 








Similarity 
(%) 




















49.2 






65.6 










58.8 






54.6 








Identity 
(%) 




















CN 
CO 
CN 






42.6 










38.6 






27.1 








Homologous gene 




















Sus scrofa domestica 






Escherichia coli ecoR1 










Mycobacterium tuberculosis 
H37Rv Rv1956 






Methanococcus jannaschii 
MJ0137 








db Match 




















pir:T03099 






sp:MTE1_ECOLI 










pir:H70638 






sp:Y137_METJA 








ORF 


o 

CD 

CO 


CN 
CN 
CN 


CN 
CO 


LD 
CO 


CD 
LO 


CD 
<T 
LO 


O 
CO 
CD 


CD 
O 
CO 


r-- 

LO 
CO 


4464 


CD 
LO 


LO 
CD 




LO 

co 


1821 


o 

CN 


CO 
CD 


CO 
CO 


o 

LO 


co 

CO 


CN 
^- 
CD 


CN 
CO 


o 

CN 


CO 
LO 


Terminal 
(nt) 


1916733 


1917165 


1917329 


1917564 


1918703 


1919646 


1920347 


1925695 


1926038 


1921547 


1926259 


1927245 


1928381 


1928908 


1929059 


1930990 


1931421 


1931935 


1932373 


1933522 


1934971 


1936849 


1937411 


1937486 


Initial 
(nt) 


1916374 


1916944 


1917640 


1918208 


1919461 


1920194 


1921276 


1925390 


1925682 


1926010 


1926837 


1928189 


1928211 


1928534 


1930879 


1931190 


1931888 


1932315 


1932879 


1934358 


1935912 


1936226 


1937202 


1938019 


SEQ
NO.
(a.a.)


5486 


5487 


5488 


5489 


5490 


5491 


5492 


5493 


5494 


5495 


5496 


5497 


5498 


5499 


5500 


5501 


5502 


5503 


5504 


5505 


5506 


5507 


5508 


5509 


SEQ 
NO. 
(DNA) 


CD 
CO 
CD 


r- 

CO 
CD 


CO 
CO 
CO 


CD 
CO 
CD 


o 

CD 
CD 


CD 
CD 


CN 
CD 
CD 


CO 
CD 
CD 


CD 
CD 


LO 
CD 
CD 


CO 
CD 
CD 


r- 

CD 
CD 


CO 
CD 
CD 


CD 
CD 
CD 


2000 


2001 


2002


2003 


2004 


2005 


2006 


2007 


2008 


2009 


































protein 
















protein 
































PSi 
















PS1 




Function 




















surface protein 








major secreted protein 
precursor 






DNA topoisomerase III 










major secreted protein 
precursor 




Matched 
length 
(aa) 




















o 

CO 








o 

CN 






CD 
LO 










co 






















































































CD 










h- 




CO 

!* 




























LO 






O 
LO 










^' 

LO 




CO 
















































>> 




















o 














GO 










r-~ 




1* 




















CO 
CM 








o 

CO 






CO 
CN 










CD 
CM 




Homologous gene 




















Enterococcus faecalis esp 








Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1






Escherichia coli topB 










Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1




db Match 




















prf:2509434A 








sp:CSP1_CORGL 






sp:TOP3_ECOLI 










sp:CSP1_CORGL 






1191 


^- 

CO 
LO 


CO 
CO 
LO 




CO 
LO 


CO 

o 

CO 


CD 
CM 


CD 
O 
CO 


LO 
CO 
CO 


CO 
CN 
CO 


CD 
CM 


CO 
CO 


CD 
CM 


1581 


2430 


CD 
CO 


2277 


2085 


CD 
CO 


CN 
CO 


T 


1887 


CD 
CM 


Terminal 
(nt) 


1940135 


1938531 


1940844 


1941550 


1941732 


1942812 


1943310 


1943653 


1944564 


1944608 


1945595 


1945952 


1946609 


1947070 


1949021 


1951619 


1952546 


1956203 


1958450 


1959765 


1960371 


1961114 


1963139 


Initial 
(nt) 


1938945 


1939064 


1940257 


1941107 


1942484 


1942510 


1943095 


1943345 


1943680 


1945435 


1945891 


1946332 


1947037 


1948650 


1951450 


1952485 


1954822 


1958287 


1959340 


1960196 


1961114 


1963000 


1963429 


SEQ 

NO. 
(a.a.) 


5510 


5511 


5512 


5513 


5514 


5515 


5516 


5517 


5518 


5519 


5520 


5521 


5522 


5523 


5524 


5525 


5526 


5527 


5528 


5529 


5530 


5531 


5532 


SEQ 

NO. 
(DNA) 


2010 


2011 
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2013 


2014 


2015 


2016 


2017 
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2019 


2020 


2021 
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2026 
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2028 
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2032 
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Function 


sodium-dependent transporter 


hypothetical protein 






riboflavin biosynthesis protein 


potential membrane protein 


methionine sulfoxide reductase 




hypothetical protein 


hypothetical protein 


ribonuclease D 


1-deoxy-D-xylulose-5-phosphate
synthase 


RNA methyltransferase 




hypothetical protein 


deoxyuridine 5-triphosphate 
nucleotidohydrolase 


hypothetical protein 
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length 
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62.7 


82.1 
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Identity 
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42.5 


41.3 




55.2 


55.7 


25.9 


55.3 


25.4 




38.1 


55.0 


46.0 




Homologous gene 


Helicobacter pylori 26695 
HP0214 


Bacillus subtilis yxaA 






Mycobacterium tuberculosis 
H37Rv Rv2671 ribD 


Mycobacterium tuberculosis 
H37Rv Rv2673 


Streptococcus gordonii msrA 




Mycobacterium tuberculosis 
H37Rv Rv2676c 


Mycobacterium tuberculosis 
H37Rv Rv2680 


Haemophilus influenzae Rd 
KW20 HI0390 rnd 


Streptomyces sp. CL190 dxs 


Thermotoga maritima MSB8 
TM1094 




Mycobacterium tuberculosis 
H37Rv Rv2696c 


Streptomyces coelicolor A3(2) 
SC2E9.09 dut 


Mycobacterium tuberculosis 
H37Rv Rv2698 
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Function 


tRNA delta-2-isopentenylpyrophosphate
transferase




hypothetical protein 






hypothetical membrane protein 


hypothetical protein 


glutamate transport ATP-binding 
protein 


Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
diagnostics 


glutamate transport system 
permease protein 


glutamate transport system 
permease protein 


regulatory protein 


hypothetical protein 




biotin synthase 


putrescine transport ATP-binding 
protein 


hypothetical membrane protein 
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length 
(aa) 
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99.6 


66.9 
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61.4 
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58.8 


Identity 
(%) 


40.0 




48.5 
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68.4 


99.6 


99.0


100.0 
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34.'5 


40.3 




33.0 


33.2 


24.6 


Homologous gene 


Escherichia coli K12 miaA 




Mycobacterium tuberculosis 
H37Rv Rv2731 






Mycobacterium tuberculosis 
H37Rv Rv2732c 


Mycobacterium leprae 
B2235_C2_195


Corynebacterium glutamicum 
ATCC 13032 gluA 


Neisseria gonorrhoeae 


Corynebacterium glutamicum 
ATCC 13032 gluC 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
13032 gluD 


Mycobacterium leprae recX 


Mycobacterium tuberculosis 
H37Rv Rv2738c 




Bacillus sphaericus bioY 


Escherichia coli K12 potG 


Bacillus subtilis ybaF 


db Match 
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NO. 
(a.a.) 
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Function 


transcriptional accessory protein 


sporulation-specific degradation 
regulator protein 


dicarboxylase translocator 


2-oxoglutarate/malate translocator


3-carboxy-cis,cis-muconate
cycloisomerase 








tRNA (guanine-N1)-methyltransferase


hypothetical protein 


16S rRNA processing protein 


hypothetical protein 


30S ribosomal protein S16 


inversin 


ABC transporter 


ABC transporter 


signal recognition particle protein 








cell division protein 


Matched 
length 
(aa) 


CO 


CO 
CO 


CO 
LO 


LO 
CO 


o 

LO 
CO 








CO 
CN 


o 

CN 


CN 


CO 
CO 


CO 
CO 


CO 
CO 


CO 
LO 
CN 


CO 
CO 


CO 
LO 
LO 








LO 
O 
LO 


Similarity 
(%) 


78.7 


65.3 


78.3 


80.0 


66.3 








64.8 


57.6 


72.1 


66.7 


79.5 


61.7 


69.1 


63.8 


78.2 








66.1 


Identity 
(%) 


56.6 


27.0 


45.8 


40.0 


39.1 








34.8 


30.5 


52.3 


29.0 


47.0 


32.1 


26.6 


35.5 


58.7 








37.0 


Homologous gene 


Bordetella pertussis TOHAMA 1 
tex 


Bacillus subtilis 168 degA 


Chlamydophila pneumoniae 
CWL029 ybhI


Spinacia oleracea chloroplast 


Pseudomonas putida pcaB 








Escherichia coli K12 trmD


Streptomyces coelicolor A3(2) 
SCF81.27 


Mycobacterium leprae 
MLCB250.34 rimM


Helicobacter pylori J99 jhp0839 


Bacillus subtilis 168 rpsP 


Mus musculus inv 


Streptococcus agalactiae cylB 


Pyrococcus horikoshii OT3 mtrA 


Bacillus subtilis 168 ffh 








Escherichia coli K12 ftsY


db Match 


sp:TEX_BORPE 


pir:A36940 


pir:H72105 


prf:2108268A 


sp:PCAB_PSEPU 








sp:TRMD_ECOLI 


gp:SCF81_27 


sp:RIMM_MYCLE 


pir:B71881


pir:C47154 


pir:T14151 


prf:2512328G


prf:2220349C 


sp:SR54_BACSU 








_j 
o 
a 

LU 

>' 

CO 

\- 

LL 

a_ 
to 


SI 


2274 


LO 

h- 

CD 


1428 


CO 
CN 


1251 


CO 
CO 


CO 
CO 
CO 


o 

CO 
CO 


CO 
CO 


CO 
CO 


CO 
LO 


CO 

co 


LO 
CO 


CD 
LO 


r-- 

CD 
CO 


CD 

co 


1641 


CO 
CO 
CO 




CO 
CO 
CD 


1530 


Terminal 
(nt) 


2154460 


2156747 


2157754 


2159019 


2159287 


2160768 


2161111 


2161507 


2162196 


2163745 


2163748 


2164737


2164815 


2166098 


2166124 


2166990 


2167944 


2171058 


2172131 


2172877 


2173759 


Initial 
(nt) 


2156733 


2157721 


2159181 


2159237 


2160537 


2160670 


2161503 


2162196 


2163014 


2163098 


2164260 


2164390 


2165309 


2165523 


2166990 


2167865 


2169584 


2170426 


2171715 


2172209 


2175288 


SEQ 
NO. 
(a.a.) 


5739 


5740 


5741 


5742 


5743 


5744 


5745 


5746 


5747 


5748 


5749 


5750 


5751 


5752 


5753 


5754 


5755 


5756 


5757 


5758 


5759 


SEQ 
NO. 
(DNA) 


2239 


2240 


2241 


2242 


2243 


2244 


2245 


2246 


2247 


2248 


2249


2250 


2251 


2252 


2253 


2254 


2255 


2256 


2257 


2258 


2259 






Function 






glucan 1,4-alpha-glucosidase or 
glucoamylase S1/S2 precursor 




chromosome segregation protein 


acylphosphatase 




transcriptional regulator 


hypothetical membrane protein 






cation efflux system protein 


formamidopyrimidine-DNA 
glycosylase 


ribonuclease III 


hypothetical protein 


hypothetical protein 


transport protein 


ABC transporter 


hypothetical protein 




Matched 
length 
(aa) 






1144 




1206 


CM 
CD 




LO 

o 

CO 


LO 

CM 






CO 
CO 


LO 
CO 
CM 


CM 
CM 


CO 


CO 
CO 
CM 


CD 
LO 

to 


LO 


CO 
CO 
CO 




Similarity 
(%) 






46.2 




72.6 


73.9 




60.0 


73.5 






76.6 


66.7 


76.5 


62.5 


76.9 


55.6 


58.8 


62.6 




Identity 
(%) 






22.4 




48.3 


51.1 




23.9 


39.3 






46.8 


36.1 


40.3 


35.8 


50.0 


28.3 


26.6 


35.3 




Homologous gene 






Saccharomyces cerevisiae 
S288C YIR019C sta1




Mycobacterium tuberculosis 
H37Rv Rv2922c smc 


Mycobacterium tuberculosis 
H37Rv Rv2922.1c




Escherichia coli K12 yfeR


Mycobacterium leprae 
MLCL581.28c 






Dichelobacter nodosus gep 


Escherichia coli K12 mutM or 


Bacillus subtilis 168 rncS 


Mycobacterium tuberculosis 
H37Rv Rv2926c 


Mycobacterium tuberculosis 
H37Rv Rv2927c 


Streptomyces verticillus


Escherichia coli K12 cydC 


Streptomyces coelicolor A3(2) 
SC9C7.02 




db Match 






sp:AMYH_YEAST 




=D 
h- 
O 
> 

I 

CD 
CD 
O 
> 

CL 
CO 


=) 
i— 

o 
> 

i 

CL 
> 

o 
< 

CL 
CO 




sp:YFER_ECOLI 


pir:S72748 






gp:DNINTREG_3 


sp:FPG_ECOLI 


pir:B69693 


ID 
\— 
O 
> 

LL f 
CD 
O 
> 

cL 

CO 


sp:Y06G_MYCTU 


prf:2104260G 


o 
o 

LU 

I 

o 

Q 
> 
O 
cL 

CO 


gp:SC9C7_2 




ORF 

(bp)


CD 

LD 


CM 
O 

r^- 


3393 


CO 
CD 
CD 


3465 


CM 
CO 
CM 


1854 


CO 
LO 
CO 


CO 
CO 


CO 
CO 

T— 




LO 

■\ — 

CD 


CO 
LO 
CO 




CO 
LO 


CD 
CO 

1"-- 


1644 


1530 


1122 


5- 


Terminal 
(nt) 


2175888 


2177103 


2176110 


2181880 


2179628 


2183110 


2183405 


2185351 


2187129 


2187342 


2187233 


2187692 


2188313 


2189166 


2189906 


2190540 


2193165 


2194694 


2198004 


2198007 


Initial 
(nt) 


2176046 


2176402 


2179502 


2180918 


2183092 


2183391 


2185258 


2186208 


2186299 


2187160 


2187679 


2188306 


2189170 


2189906 


2190439 


2191328 


2191522 


2193165 


2196883 


2198447 


SEQ 
NO. 
(a.a.) 


5760 


5761 


5762 


5763 


5764 


5765 


5766 


5767 


5768 


5769 


5770 


5771 


5772 


5773 


5774 


5775 


5776 


5777 


5778 


5779 


SEQ 
NO. 
(DNA) 


2260 


2261 


2262 


2263 


2264 


2265 


2266 


2267 


2268 


2269 


2270 


2271 


2272 


2273 


2274 


2275 


2276 


2277 


2278 


2279 
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ase 










chloramphenicol resistance protein 
or transmembrane transport protein 


Function 


hypothetical protein 


peptidase 


sucrose transport protein 






maltodextrin phosphorylase/ 
glycogen phosphorylase 


hypothetical protein 


prolipoprotein diacylglyceryl 
transferase 


indole-3-glycerol-phosphate 
synthase / anthranilate synthase 
component II 


hypothetical membrane protein 


phosphoribosyl-AMP cyclohydrol 


cyclase 


inositol monophosphate 
phosphatase 


phosphoribosylformimino-5- 
aminoimidazole carboxamide 
ribotide isomerase 


glutamine amidotransferase


Matched
length
(aa)


LO 

o 


CO 
LO 
CO 


CO 
CO 






t — 

CO 


LO 
CD 
CN 


CO 
CN 


CD 

to 


CO 
CN 
CN 


CO 
CO 


CO 
LO 
CN 


CN 


LO 
CN 


o 

CN 


CN 
O 

^j- 




































>> 


































Similarity
(%) 


43.7 


64.3 


51.9 






67.4 


66.4 


65.5 


CN 
CO 


58.8 


79.8 


97.7 


94.0 


97.6 


92.4 


54.0 


Identity 
(%) 


21.0 


32.9 


27.1 






36.1 


33.9 


31.4 


29.6 


29.4 


52.8 


97.3 


94.0 


95.9 


86.7 


25.6 


Homologous gene 


Thermotoga maritima MSB8 
TM0896 


Campylobacter jejuni ATCC 
43431 hipO 


Arabidopsis thaliana SUC1 






Thermococcus litoralis malP 


Bacillus subtilis 168 yfiE 


Staphylococcus aureus FDA 485 


Emericella nidulans trpC 


Mycobacterium tuberculosis 
H37Rv Rv1610 


Rhodobacter sphaeroides ATCC 
17023 hisI


Corynebacterium glutamicum 
AS019 hisF 


Corynebacterium glutamicum 
AS019 impA


Corynebacterium glutamicum 
AS019 hisA


Corynebacterium glutamicum 
AS019 hisH 


Streptomyces lividans 66 cmlR


db Match 


pir:A72322 


sp:HIPO_CAMJE 


pir:S38197 






prf:2513410A 


=> 
CO 

o 
< 

CD 

u. 1 

LL 
> 

CL 
CO 


sp:LGT_STAAU 


sp:TRPG_EMENI 


pir:H70556 


sp:HIS3_RHOSH


sp:HIS6_CORG 


prf:2419176B 


gp:AF051846_1 


gp:AF060558_1 


_j 

\- 

co 

o 

CL 

co 


ORF 


1284 


1263 


CO 
CO 
CO 


LO 

CO 


to 
r«- 

CN 


2550 


O 
O 
CO 


CO 

CO 


o 

CO 


LO 
CO 


LO 
CO 




LO 
CN 
CO 


CO 
CO 

r- 


CO 
CO 
CO 


1266 


Terminal 
(nt) 


2199758 


2201070 


2201073 


2201450 


2201594 


2201992 


2204591 


2207302 


2208367 


2209232 


2209920 


2210273 


2211051 


2211882 


2212641 


2214321 


Initial 
(nt) 


2198475 


2199808 


2201408 


2201584 


2201869 


2204541 


2205490 


2208249 


2209167 


2209888 


2210273 


2211046 


2211875 


2212619 


2213273 


2215586 


SEQ 
NO. 
(a.a.) 


5780 


5781 


5782 


5783 


5784 


5785 


5786 


5787 


5788 


5789 


5790 


5791 


5792 


5793 


5794 


5795 


SEQ 
NO. 
(DNA) 


2280 


2281 


2282 


2283 


2284 


2285 


2286 


2287 


2288 


2289 


2290 


2291 


2292 


2293 


2294 


2295 






Function 




imidazoleglycerol-phosphate 
dehydratase 


histidinol-phosphate 
aminotransferase 


histidinol dehydrogenase 


serine-rich secreted protein 






histidine secretory acid phosphatase 


tet repressor protein 


glycogen debranching enzyme 


hypothetical protein 


oxidoreductase 


myoinositol 2-dehydrogenase 


galactitol utilization operon repressor 


ferrichrome transport ATP-binding 
protein or ferrichrome ABC 
transporter 


hemin permease 


iron-binding protein 


iron-binding protein 


hypothetical protein 


Matched 
length 
(aa)




CO 

o 


CM 
CD 
CO 


CD 
CO 
^J" 


CN 

co 






CM 


o 

CN 


CN 
CM 

r-- 


CO 
LO 
CN 


co 
CO 
CN 


CO 

co 


CD 
CN 
CO 


CO 
CN 


CM 
CO 
CO 


CO 

o 


CN 
CO 


CO 

T — 


Similarity 
(%) 




81.8 


79.3 


85.7 


54.4 






59.7 


60.8 


LO 
LO 


76.0 


55.2 


60.9 


64.4 


68.3 


71.1 


68.0 


67.6


73.5 


Identity 
(%) 




52.5 


57.2 


63.8 


27.2 






CD 
CN 


28.9 


47.4


o 
o 

LO 


29.9 


35.0 


30.4 


32.9 


CO 

to 

CO 


30.1 


34.6 


38.1 


Homologous gene 




Streptomyces coelicolor A3(2) 
hisB 


Streptomyces coelicolor A3(2) 
hisC 


Mycobacterium smegmatis 
ATCC 607 hisD 


Schizosaccharomyces pombe 
SPBC215.13 






Leishmania donovani SAcP-1 


Escherichia coli plasmid RP1 
tetR 


Sulfolobus acidocaldarius treX 


Mycobacterium tuberculosis 
H37Rv Rv2622 


Streptomyces coelicolor A3(2) 
SC2G5.27c gip 


Sinorhizobium meliloti idhA 


Escherichia coli K12 galR


Bacillus subtilis 168 fhuC 


Vibrio cholerae hutC 


Bacillus subtilis 168 yvrC 


Bacillus subtilis 168 yvrC


Escherichia coli K12 ytfH


db Match 




sp:HIS7_STRCO


sp:HIS8_STRCO 


CO 

o 
>- 

x l 

CO 
X 
cL 

LO 


gp:SPBC215_13 






prf:2321269A 


pir:RPECR1 


prf:2307203B


pir:E70572 


gp:SC2G5_27 


prf:2503399A


sp:GALR_ECOLI 


sp:FHUC_BACSU 


prf:2423441E 


pir:G70046


pir:G70046 


_i 
o 
o 

LU 

1 

X 
LL 
h- 
> 
CL 
CO 


ORF 

1&PJ 


LO 
CN 
CM 


CD 
O 
CD 


1098 


1326 


1200 


CO 
CD 


CD 
O 
CO 


CN 
CO 


T — 

CO 
LO 


2508 


o 

CO 


■*r 


1011 


CO 
CD 
CD 


CO 
CD 

r-~- 


1038 


CO 

co 


CD 
LO 




Terminal 
(nt) 


2215639 


2215869 


2216494 


2217600 


2220358 


2220459 


2221919 


2221187 


2222518 


2225035 


2225949 1 


2225990 


2226769 


2228901 


2229099 


2229900 


2230947 


2231339 


2232016 


Initial 
(nt) 


2215863 


2216474 


2217591 


2218925 


2219159 


2221109 


2221611 


2221828 


2221958 


2222528 


2225149 


2226763 


2227779 


2227906 


2229896 


2230937 


2231294 


2231932 


2232456 


SEQ 
NO. 
(a.a.) 


5796 


5797 


5798 


5799 


5800 


5801 


5802 


5803 


5804 


5805 




5806 


5807 


5808 


5809 


5810 


5811 


5812 


5813 


5814 


SEQ 
NO. 
(DNA) 


2296 


2297 


2298 


2299 


2300 


2301 


2302 


2303 


2304 


2305 


2306 


2307 


2308 


2309 


2310 


2311 


2312 


2313 


2314 
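
The ORF lengths listed in the table above can be cross-checked against the Initial and Terminal positions. In the rows where all three values are legible, the ORF length equals the absolute difference between the two positions plus one (for example, Initial 2217591 and Terminal 2216494 give 1098). The following is a minimal illustrative sketch of that check. The function names are hypothetical, and reading an Initial position larger than the Terminal position as an ORF on the complementary strand is an assumption.

```python
def orf_length(initial: int, terminal: int) -> int:
    """Length in bp of an ORF given its Initial and Terminal positions (inclusive)."""
    return abs(terminal - initial) + 1


def strand(initial: int, terminal: int) -> str:
    """Assumed convention: Initial > Terminal means the complementary strand."""
    return "+" if initial <= terminal else "-"


def codon_count(length_bp: int) -> int:
    """Number of codons in the ORF, stop codon included if it is part of the ORF."""
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length_bp // 3


if __name__ == "__main__":
    # (Initial, Terminal, ORF) triples taken from legible rows of the table.
    rows = [
        (2217591, 2216494, 1098),
        (2218925, 2217600, 1326),
        (2219159, 2220358, 1200),
        (2222528, 2225035, 2508),
    ]
    for initial, terminal, listed in rows:
        computed = orf_length(initial, terminal)
        print(initial, terminal, strand(initial, terminal), computed, computed == listed)
```

The same relation can be used to recover an ORF value that is unreadable in a given row, provided both coordinates of that row are legible.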
Function 


D-glutamate racemase 




bacterial regulatory protein, marR 
family 


hypothetical membrane protein 




endo-type 6-aminohexanoate 
oligomer hydrolase 


hypothetical protein 


hypothetical protein 




hypothetical protein 




ATP-dependent helicase 


hypothetical membrane protein 


hypothetical protein 


phosphoserine phosphatase 




cytochrome c oxidase chain I 
SCE22.22 


Mycobacterium tuberculosis 
H37Rv Rv1337 




Flavobacterium sp. nylC 
Mycobacterium tuberculosis 
H37Rv Rv1331 




Mycobacterium tuberculosis 
H37Rv Rv1330c 




Escherichia coli dinG 


Mycobacterium tuberculosis 
Streptomyces coelicolor A3(2) 
Function 


hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein 


transposase (IS1676) 


major secreted protein PS1 protein 
precursor 








transposase (IS1676) 




proton/sodium-glutamate symport 
protein 




ABC transporter 




ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical protein 




oxidoreductase or dehydrogenase 
length 
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CO 
CO 




CO 

CN 


CO 


CN 




CD 
CO 

T — 












































64.3 


61.5 


oS 


48.6 


49.6 








46.6 




66.2 




69.0 




79.8 


67.0 


75.0 




54.1 


CO 








































Identity 
(%) 


41.7 


25.4 


51.2 


24.2 


24.8 








24.6 




30.8 




33.0 




45.4 


60.0 


71.0 




28.1 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3069 


Helicobacter pylori J99 jhp1146 


Bacillus subtilis 168 ycsI 


Rhodococcus erythropolis 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1 








Rhodococcus erythropolis 




Bacillus subtilis 168 




Streptomyces coelicolor A3(2) 
SCE25.30 




Staphylococcus aureus 


Chlamydophila pneumoniae 
AR39 CP0987 


Chlamydia muridarum Nigg 
TC0129 




Streptomyces collinus Tu 1892 
ansG 


db Match 


pir:F70650 


pir:D71843 


sp:YCSI_BACSU 


gp:AF126281_1 


sp:CSP1_CORGL 








gp:AF126281_1 




< 
o 

CO 
(— ' 

o 

CL 

to 




gp:SCE25_30 




gp:SAU18641_2 


PIR:F81516 


PIR:F81737 




prf:2509388L 


ORF 
(bp) 


CO 
CO 
CN 


CN 
CO 


CN 
CD 

r-. 


1365 


1620 


LO 

CO 


LO 
CO 


r*- 


1401 


CO 
CD 


1338 


CO 
CO 
CO 


2541 


CO 
CO 


CO 

o 


CO 
CN 




CO 

co 


CN 
h- 
co 


Terminal 
(nt) 


2690437 


2690760 


2691564 


2693053 


2694918 


2695279 


2695718 


2695320 


2697212 


2697383 


2698194 


2701612 


2699926 


2703356 


2702487 


2704586 


2704975 


2710555 


2711308 


Initial 
(nt) 


2690150 


2690437 


2690773 


2691689 


2693299 


2694926 


2695554 


2695766 


2695812 


2698150 


2699531 


2700920 


2702466 


2702466 


2703194 


2704314 


2704835 


2709878 


2710637 


SEQ 

NO. 
(a.a.) 


6290 


6291 


6292 


6293 


6294 


6295 


6296 


6297 


6298 


6299 


6300 


6301 


6302 


6303 


6304 


6305 


6306 


6307 


6308 


SEQ 

NO. 
(DNA) 


2790 


2791 


2792 


2793 


2794 


2795 


2796 


2797 


2798 


2799 


2800 


2801 


2802 


2803 


2804 


2805 


2806 


2807 


2808 



Function 


methyltransferase 


hypothetical protein 


hypothetical protein 




UDP-N-acetylglucosamine 1- 
carboxyvinyltransferase 


hypothetical protein 


transcriptional regulator 




cysteine synthase 


O-acetylserine synthase 


hypothetical protein 


succinyl-CoA synthetase alpha 
chain 


hypothetical protein 


succinyl-CoA synthetase beta chain 




frenolicin gene E product 




succinyl-CoA coenzyme A 
transferase 


transcriptional regulator 


Matched 
length 
(aa) 


LO 
O 
CM 


co 


CM 

xr 




I s - 
? 


o 

CD 


CO 
CM 




LO 

o 

CO 


CM 


CO 
CO 


CD 
CM 


LO 
I s - 


o 
o 
xj- 




CO 
CN 




o 

LO 


CN 
CO 
















































































Similarity 
(%) 


CM 


o 


o 




CO 


CN 


o 




CO 


I s - 


T — 


xr 


O 


o 




CO 




CO 


LO 


LO 


CO 
CO 


LO* 




LO* 

I s - 


XT 

CO 


cb 

CO 




XT 
CO 


oS 
I s - 


LO 
CO 


CD 

I s - 


CO* 
XT 


CO 

I s - 




I s - 




LL 


CO 
CO 




cx> 


o 


o 




CO 


CO 


co 










CD 


o 


CO 




LO 




CO 


CD 




LO 
CM 


CO 






xr 
xr 


CO 
CO 


LO 

xr 




r--' 

LO 


t — 

CO 


CO 
CO 


CM 
LO 


CM 

xr 


CD* 
CO 




CO 
CO 




1^ 

XT 


CO 
CO 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0089 


Chlamydia pneumoniae 


Chlamydia muridarum Nigg 
TC0129 




Acinetobacter calcoaceticus 
NCIB 8250 murA 


Mycobacterium tuberculosis 
H37Rv Rv1314c 


Streptomyces coelicolor A3(2) 
SC2G5.15c 




Bacillus subtilis 168 cysK 


Azotobacter vinelandii cysE2 


Deinococcus radiodurans R1 
DR1844 


Coxiella burnetii Nine Mile Ph I 
sucD 


Aeropyrum pernix K1 APE1069 


Bacillus subtilis 168 sucC 




Streptomyces roseofulvus frnE 




Clostridium kluyveri cat1 


Azospirillum brasilense ATCC 
29145 ntrC 


db Match 


sp:Y089_MYCTU 


GSP:Y35814 


PIR:F81737 




sp:MURA_ACICA 


sp:Y02Y_MYCTU 


gp:SC2G5_15 




sp:CYSK_BACSU 


prf:2417357C 


gp:AE002024_10 


sp:SUCD_COXBU 


PIR:F72706 


sp:SUCC_BACSU 




gp:AF058302_5 




sp:CAT1_CLOKL 


sp:NIR3_AZOBR 


ORF 
(bp) 


LO 
CM 
LO 


CO 

I s - 

CM 


xr 


LO 
CD 


1254 


o 

LO 


CO 

xr 

CO 


CO 

o 

xT 


xr 

CN 
CD 


CO 

xr 

LO 


CO 
CO 
CM 


CM 
CO 
CO 


LO 
CN 
CN 


1194 


o 

CO 
CO 


LO 
CO 
I s - 


CO 
CO 


1539 


1143 


Terminal 
(nt) 


2712374 


2713453 


2713842 


2717993 


2718436 


2720319 


2720385 


2721295 


2722857 


2723609 


2723770 


2724478 


2725843 


2725384 


2726786 


2727399 


2728207 


2729378 


2732518 


Initial 
(nt) 


2711850 


2713181 


2713702 


2718187 


2719689 


2719750 


2721227 


2721702 


2721934 


2723064 


2724057 


2725359 


2725619 


2726577 


2727145 


2728133 


2729025 


2730916 


2731376 


SEQ 
NO. 
(a.a.) 


6309 


6310 


6311 


6312 


6313 


6314 


6315 


6316 


6317 


6318 


6319 


6320 


6321 


6322 


6323 


6324 


6325 


6326 


6327 


SEQ 
NO. 
(DNA) 


2809 


2810 


2811 


2812 


2813 


2814 


2815 


2816 


2817 


2818 


2819 


2820 


2821 


2822 


2823 


2824 


2825 


2826 


2827 






SEQ NO. (DNA) | SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp) | db Match | Homologous gene | Identity (%) | Similarity (%) | Matched length (a.a.) | Function
2828 | 6328 | 2732230 | 2731424 | 807 |  |  |  |  |  | 
2829 | 6329 | 2732636 | 2733367 | 732 | pir:E70810 | Mycobacterium tuberculosis H37Rv Rv0821c phoY-2 | 46.5 | 81.7 | [illegible] | phosphate transport system regulatory protein
2830 | 6330 | 2734351 | 2733455 | 897 | pir:S68595 | Pseudomonas aeruginosa pstB | 58.8 | 82.8 | [illegible] | phosphate-specific transport component
2831 | 6331 | 2735184 | 2734264 | 921 | gp:MTPSTA1_1 | Mycobacterium tuberculosis H37Rv Rv0830 pstA1 | 51.4 | 82.2 | [illegible] | phosphate ABC transport system permease protein
2832 | 6332 | 2736215 | 2735202 | 1014 | pir:A70584 | Mycobacterium tuberculosis H37Rv Rv0829 pstC2 | 50.2 | 78.5 | [illegible] | phosphate ABC transport system permease protein
2833 | 6333 | 2737538 | 2736414 | 1125 | pir:H70583 | Mycobacterium tuberculosis H37Rv phoS2 | 40.0 | 56.0 | [illegible] | phosphate-binding protein S-3 precursor
2834 | 6334 | 2738711 | 2737836 | 876 | gp:SCD84_18 | Streptomyces coelicolor A3(2) SCD84.18c | 34.3 | 60.0 | [illegible] | acetyltransferase
2835 | 6335 | 2738771 | 2739553 | 783 |  |  |  |  |  | 
2836 | 6336 | 2740650 | 2739556 | 1095 | sp:BMRU_BACSU | Bacillus subtilis 168 bmrU | 24.7 | 55.2 | [illegible] | hypothetical protein
2837 | 6337 | 2740670 | 2741356 | 687 | pir:E70809 | Mycobacterium tuberculosis H37Rv Rv0813c | 44.9 | 74.2 | [illegible] | hypothetical protein
2838 | 6338 | 2742577 | 2741636 | 942 | gp:AF193846_1 | Solanum tuberosum BCAT2 | 28.6 | 56.0 | [illegible] | branched-chain amino acid aminotransferase
2839 | 6339 | 2742685 | 2743785 | 1101 | gp:AB003158_6 | Corynebacterium ammoniagenes ATCC 6872 ORF4 | 58.5 | 79.0 | [illegible] | hypothetical protein
2840 | 6340 | 2744010 | 2744222 | 213 | pir:B70809 | Mycobacterium tuberculosis H37Rv Rv0810c | 58.6 | 81.0 | [illegible] | hypothetical protein
2841 | 6341 | 2745954 | 2744881 | 1074 | gp:AB003158_5 | Corynebacterium ammoniagenes ATCC 6872 purM | [illegible] | 94.2 | [illegible] | 5'-phosphoribosyl-5-aminoimidazole synthetase
2842 | 6342 | 2747564 | 2746083 | 1482 | gp:AB003158_4 | Corynebacterium ammoniagenes ATCC 6872 purF | 70.3 | 89.0 | [illegible] | amidophosphoribosyl transferase






Function 


hypothetical protein 


hypothetical protein 


hypothetical membrane protein 


hypothetical protein 


5-phosphoribosyl-N- 
formylglycinamidine synthetase 




5'-phosphoribosyl-N- 
formylglycinamidine synthetase 


hypothetical protein 




glutathione peroxidase 


extracellular nuclease 




hypothetical protein 


C4-dicarboxylate transporter 


dipeptidyl aminopeptidase 


Matched 
length 
(a.a) 


CM 


LO 
CO 


T — 

CM 


CM 


CO 
CO 

r-- 




CO 
CM 
CN 


CO 

r- 




CO 
LO 


LO 

CO 
CO 




CM 


T — 


co 

CO 


Similarity 
(%) 


75.8 


94.0 


87.1 


71.0 



89.5 




93.3 


93.7 




77.9 


LO 
T — 
LO 




68.7 


81.6 


70.6 


Identity 
(%) 


57.3 


75.9 


67.7 


64.0 


77.6 




80.3 


81.0 




46.2 


O 
CO 
CN 




37.4 


49.0 


41.8 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0807 


Corynebacterium 
ammoniagenes ATCC 6872 
ORF2 


Corynebacterium 
ammoniagenes ATCC 6872 
ORF1 


Sulfolobus solfataricus 


Corynebacterium 
ammoniagenes ATCC 6872 
purL 




Corynebacterium 
ammoniagenes ATCC 6872 
purQ 


Corynebacterium 
ammoniagenes ATCC 6872 
purorf 




Lactococcus lactis gpo 


Aeromonas hydrophila JMP636 
nucH 




Mycobacterium tuberculosis 
H37Rv Rv0784 


Salmonella typhimurium LT2 
dctA 


Pseudomonas sp. WO24 dapb1 


db Match 


pir:H70536 


gp:AB003158_2 


gp:AB003158_1 


GP:SSU18930_214 


gp:AB003162_3 




gp:AB003162_2 


gp:AB003162_1 




prf:2420329A 


prf:2216389A 




pir:C70709 


sp:DCTA_SALTY 


prf:2408266A 


ORF 
(bp) 


LO 

h- 

CO 


1017 


r-- 


CO 
CO 


2286 


o 

CN 
1^- 


CO 
CO 
CO 


CO 
CN 


CN 
CM 
LO 


r^- 


2748 


CO 
CM 


CO 
CO 


1338 


2118 


Terminal 
(nt) 


2747683 


2749111 


2749162 


2752103 


2750027 


2753121 


2752327 


2752995 


2753819 


2753328 


2756739 


2757126 


2757129 


2757863 


2759532 


Initial 
(nt) 


2748057 


2748095 


2749902 


2751918 


2752312 


2752402 


2752995 


2753237 


2753298 


2753804 


2753992 


2756851 


2757815 


2759200 


2761649 


SEQ 
NO. 
(a.a.) 


6343 


6344 


6345 


6346 


6347 


6348 


6349 


6350 


6351 


6352 


6353 


6354 


6355 


6356 


6357 


SEQ 

NO. 
(DNA) 


2843 


2844 
2846 
2850 
2852 
Function 


peptide synthase 




phenylacetaldehyde dehydrogenase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


heat shock protein or chaperon or 
groEL protein 














hypothetical protein 






peptidase 






Na+/H+ antiporter or multiple 
resistance and pH regulation related 
protein A or NADH dehydrogenase 


Matched 
length 
(aa) 


1241 




CO 
CO 

Tj- 


CN 


TT 
LO 


CO 


CO 
LO 














1236 






Tf 
TT 






r- 
co 


Similarity 
(%) 


51.6 




63.7 


79.7 


63.0 


80.0 


100.0 














42.3 






68.0 






68.3 


Identity 
(%) 


28.4 




35.0 


57.3 


62.0 


74.0 


99.5 














21.7 






37.1 






35.6 


Homologous gene 


Streptomyces roseosporus cpsB 




Escherichia coli K12 padA 


Campylobacter jejuni CJ0604 


Mycobacterium tuberculosis 


Mycobacterium tuberculosis 


Brevibacterium flavum MJ-233 














Homo sapiens MUC5B 






Mycobacterium tuberculosis 
H37Rv Rv2522c 






Staphylococcus aureus mnhA 


db Match 


prf:2413335A 




prf:2310295A 


gp:CJ11168X2_254 


GP:MSGTCWPA_1 


GP:MSGTCWPA_1 


gsp:R94368 














prf:2309326A 






pir:G70870 






prf:2504285B 


ORF 
(bp) 


3885 


1461 


1563 


CO 
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CN 
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CO 
CD 
3591 
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CM 
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CD 
Terminal 
(nt) 


2884882 


2881844 


2884935 


2886916 


2890346 


2890553 


2888897 


2890751 


2890930 


2892138 


2893100 


2895072 


2897528 


2900330 


2903964 


2906639 


2908885 


2909788 


2909231 


2913228 


Initial 
(nt) 


2880998 


2883304 


2886497 


2887833 


2890185 


2890377 


2890540 


2890930 


2892138 


2893100 


2895085 


2897525 


2900326 


2903920 


2906738 


2907250 


2907515 


2909210 


2909830 


2910172 
NO. 
(a.a.) 
6489 
6491 
6498 


6499 
NO. 
(DNA) 
2989 
Function 




membrane transport protein or 
bicyclomycin resistance protein 


sodium dependent phosphate pump 


phenazine biosynthesis protein 




ABC transporter 


ABC transporter ATP-binding protein 


mutator mutT protein 


hypothetical membrane protein 


glutamine-binding protein precursor 


serine/threonine kinase 




ferredoxin/ferredoxin-NADP 
reductase 


acetyltransferase (GNAT) family 








phosphoribosylglycinamide 
formyltransferase 




Matched 
length 
(aa) 
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CO 
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CO 
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LO 




Homologous gene 




Escherichia coli K12 bcr 


Vibrio cholerae JS1569 nptA 


Pseudomonas aureofaciens 30- 
84 phzC 




Streptomyces coelicolor A3(2) 
SCE8.16C 


Bacillus licheniformis ATCC 
9945A bcrA 


Mycobacterium tuberculosis 
H37Rv Rv0413 


Mycobacterium tuberculosis 
H37Rv Rv0412c 


Bacillus stearothermophilus 
NUB36 glnH 


Mycobacterium tuberculosis 
H37Rv Rv0410c pknG 




Bos taurus 


Escherichia coli K12 elaA 








Bacillus subtilis 168 purT 




db Match 




BCR_ECOLI 


VCAJ10968_1 


PHZC_PSEAR 




SCE8_16 


BCRA_BACLI 


C70629 


B70629 


GLNH_BACST 


H70628 




ADRO_BOVIN 

ELAA_ECOLI 








PURT_BACSU 








:ds 


cL 

CO 


sp: 




CL 
CD 


:ds 


pir: 


pir: 


ds 


CL 
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ds 








sp: 






lo 

CD 


1194 
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CO 


CO 
CO 
CD 


CO 
CD 
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CD 
CO 
CD 
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LO 


1386 


1032 


2253 




1365 


CO 
LO 


1062 
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CO 
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CO 


1194 


CO 
CO 
CO 


Terminal 
(nt) 


2924844 


2923954 


2926704 


2926707 


2927651 


2927551 


2928302 


2929256 


2931336 


2932371 


2934829 


2932652 


2939767 


2940452 


2940447 


2941472 


2942609 


2943012 


2945639 


Initial 
(nt) 


2924191 


2925147 


2925541 


2927546 


2928283 


2928318 


2929237 


2929756 


2929951 


2931340 


2932577 


2933398 


2938403 


2939907 


2941508 


2942500 


2943007 


2944205 


2946526 


SEQ 

NO. 
(a.a.) 


6515 


6516 


6517 


6518 


6519 


6520 


6521 


6522 


6523 


6524 


6525 


6526 


6527 


6528 


6529 


6530 


6531 


6532 


6533 


SEQ 
NO. 
(DNA) 
3018 


3019 


3020 
3025 
3033 
Function 


hypothetical membrane protein 


hypothetical membrane protein 


propionyl-CoA carboxylase complex 
B subunit 


polyketide synthase 


acyl-CoA synthase 


hypothetical protein 




major secreted protein PS1 protein 
precursor 






antigen 85-C 


hypothetical membrane protein 


nodulation protein 


hypothetical protein 


hypothetical protein 




phosphatidic acid phosphatase 


Matched 
length 
(aa) 


CD 
CO 
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CO 
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CD 
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>* 




































Similarity 
(%) 


62.9 


69.4 


76.9 


54.2 


62.3 


67.4 




99.5 






62.5 


61.2 


51.5 


75.0 


I s - 




56.5 


Identity 
(%) 


29.1 


34.3 


49.7 


30.2 


33.5 


39.8 




98.6 






36.3 


37.5 


27.1 


CN 
LO 


55.6 




28.2 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0204c 


Mycobacterium tuberculosis 
H37Rv Rv0401 


Streptomyces coelicolor A3(2) 
pccB 


Streptomyces erythraeus eryA 


Mycobacterium bovis BCG 


Mycobacterium tuberculosis 
H37Rv Rv3802c 




Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1 






Mycobacterium tuberculosis 
ERDMANN RV0129C fbpC 


Mycobacterium tuberculosis 
H37Rv Rv3805c 


Azorhizobium caulinodans 
ORS571 noeC 


Mycobacterium tuberculosis 
H37Rv Rv3807c 


Mycobacterium tuberculosis 
H37Rv Rv3808c 




Bacillus licheniformis ATCC 
9945A bcrC 


db Match 


pir:A70839 


pir:H70633 


gp:AF113605_1 


a: 
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o 
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CO 
LU 

CL 

to 


prf:2310345A 


pir:F70887 




sp:CSP1_CORGL 






sp:A85C_MYCTU 


pir:A70888 


sp:NOEC_AZOCA 


pir:C70888 


pir:D70888 




sp:BCRC_BACLI 
1494 
(nt) 


3060733 


3061095 


3061380 


3062951 


3068143 


3070214 


3071147 


3071650 


3075447 


3073857 


3075540 


3076715 


3078853 


3079848 


3080344 


3083960 


3083935 


Initial 
(nt) 


3059651 


3060733 


3062927 


3067780 


3069930 


3071140 


3071644 


3073620 


3074047 


3074075 


3076562 


3078772 


3079848 


3080351 


3082311 


3082467 


3084411 
NO. 
(a.a.) 
6675 
6678 


6679 


SEQ 
NO. 
(DNA) 


3163 


3164 


3165 


3166 


3167 


3168 


3169 


3170 


3171 


3172 


3173 


3174 


3175 


3176 


3177 


3178 


3179 
Function 






dimethylaniline monooxygenase (N- 
oxide-forming) 




UDP-galactopyranose mutase 


hypothetical protein 


glycerol kinase 


hypothetical protein 


acyltransferase 


seryl-tRNA synthetase 


transcriptional regulator, GntR family 
or fatty acyl-responsive regulator 


hypothetical protein 


hypothetical protein 




2,3-PDG dependent 
phosphoglycerate mutase 




nicotinamidase or pyrazinamidase 
Homologous gene 






Sus scrofa fmo1 




Escherichia coli K12 glf 


Mycobacterium tuberculosis 
H37Rv Rv3811 csp 


Pseudomonas aeruginosa 
ATCC 15692 glpK 


Mycobacterium tuberculosis 
H37Rv Rv3813c 


Mycobacterium tuberculosis 
H37Rv Rv3816c 


Mycobacterium tuberculosis 
H37Rv 


Escherichia coli K12 farR 


Mycobacterium tuberculosis 
H37Rv Rv3835 


Mycobacterium tuberculosis 
H37Rv Rv3836 




Amycolatopsis methanolica pgm 




Mycobacterium smegmatis pzaA 




db Match 






sp:FMO1_PIG 




sp:GLF_ECOLI 


pir:G70520 


sp:GLPK_PSEAE 


pir:A70521 


pir:D70521 


gsp:W26465 


sp:FARR_ECOLI 
gp:AMU73808_1 
Function 


L-lactate dehydrogenase or FMN- 
dependent dehydrogenase 




immunity repressor protein 






phosphatase or reverse 
transcriptase (RNA-dependent) 




peptidase or IAA-amino acid 
hydrolase 




peptide methionine sulfoxide 
reductase 


superoxide dismutase (Fe/Mn) 


transcriptional regulator 


multidrug resistance transporter 








hypothetical protein 


membrane transport protein 


transcriptional regulator 


two-component system response 
regulator 
cytochrome b6-F complex iron-sulfur 
subunit (Rieske iron-sulfur protein) 


NADH oxidase or NADH-dependent 
flavin oxidoreductase 


hypothetical membrane protein 


hypothetical protein 


bacterial regulatory protein, arsR 
family or methylenomycin A 
resistance protein 


NADH oxidase or NADH-dependent 
flavin oxidoreductase 


hypothetical protein 










acetoin(diacetyl) reductase (acetoin 
dehydrogenase) 


hypothetical protein 


di-/tripeptide transporter 




bacterial regulatory protein, tetR 
Homologous gene 


Chlorobium limicola petC 
nadO 
Streptomyces coelicolor A3(2) 
Streptomyces coelicolor Plasmid 
maleylacetate reductase 


sugar transporter or D-xylose-proton 
symporter (D-xylose transporter) 


bacterial transcriptional regulator or 
acetate operon repressor 


oxidoreductase 


diagnostic fragment protein 
sequence 


myo-inositol 2-dehydrogenase 


dehydrogenase or myo-inositol 2- 
dehydrogenase or streptomycin 
biosynthesis protein 


phosphoesterase 








stomatin 




DEAD box RNA helicase family 


hypothetical membrane protein 
mercuric ion-binding protein or 
heavy-metal-associated domain 
containing protein 
Function 




thioredoxin ch2, M-type 


N-acetylmuramoyl-L-alanine 
amidase 






hypothetical protein 


hypothetical protein 


partitioning or sporulation protein 


glucose inhibited division protein B 


hypothetical membrane protein 


ribonuclease P protein component 


50S ribosomal protein L34 






L-aspartate-alpha-decarboxylase 
precursor 


2-isopropylmalate synthase 


hypothetical protein 


aspartate-semialdehyde 
dehydrogenase 


3-dehydroquinase 


Matched 
length 
(aa) 




CD 


CD 
CD 






CN 
CN 


CO 
CO 


CN 
CN 


CO 
LO 


CO 
x — 
CO 


CO 
CN 








CD 
CO 

x — 


CD 
CD 


LO 
CO 


CO 


CO 

T — 








































Similarity 
(%) 




76.5 


LD* 






58.5 


60.5 


78.0 


64.7 


75.4 


59.4 


93.6 






100.0 


100.0 


100.0 


100.0 


100.0 


CO 








































Identity 
(%) 




42.0 


51.0 






CO 


37.6 


65.0 


36.0 


44.7 


26.8 


83.0 






100.0 


100.0 


100.0 


100.0 


100.0 


Homologous gene 




Chlamydomonas reinhardtii thi2 


Bacillus subtilis cwlB 






Mycobacterium tuberculosis 
H37Rv Rv3916c 


Pseudomonas putida ygi2 


Mycobacterium tuberculosis 
H37Rv parB 


Escherichia coli K12 gidB 


Mycobacterium tuberculosis 
H37Rv Rv3921c 


Bacillus subtilis rnpA 


Mycobacterium avium rpmH 






Corynebacterium glutamicum 
panD 


Corynebacterium glutamicum 
ATCC 13032 leuA 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
13032 orfX 


Corynebacterium glutamicum 
asd 


Corynebacterium glutamicum 
AS019 aroD 


db Match 




sp:THI2_CHLRE 


sp:CWLB_BACSU 






pir:D70851 


sp:YGI2_PSEPU 


sp:YGI1_PSEPU 


sp:GIDB_ECOLI 


pir:A70852 


sp:RNPA_BACSU 


gp:MAU19185_1 






gp:AF116184_1 


sp:LEU1_CORGL 


sp:YLEU_CORGL 


sp:DHAS_CORGL 


gp:AF124518_1 


ORF 
(bp) 


1185 


CN 
1041 


CO 

to 
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h- 
CO 
CO 
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CD 
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CD 


CD 
CD 
CO 
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CO 
CO 


co 

CN 


CN 
CN 
CN 


CO 
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LO 
LO 
CN 


1032 




Terminal 
(nt) 


3300119 


3301729 


3302996 


3301989 


3304475 


3302999 


3303636 


3304835 


3305864 


3306682 


3307971 


3308412 


3309321 


3308822 


147573 


266154 


268814 


271691 


446521 


Initial 
(nt) 


3301303 


3301358 


3301755 


3302765 


3303435 


3303616 


3304787 


3305671 


3306532 


3307632 


3308369 


3308747 


3309028 


3309043 


147980 


268001 


269068 


270660 


446075 


SEQ 
NO. 
(a.a.) 


6918 


6919 


6920 


6921 


6922 
6924 


6925 
6928 


6929 


6930 


6931 


6932 


6933 


6934 


6935 


6936 


SEQ 
NO. 
(DNA) 


3418 
3421 


3422 


3423 


3424 


3425 


3426 


3427 


3428 


3429 


3430 


3431 


3432 


3433 


3434 


3435 


3436 
Example 2 

Determination of effective mutation site 

(1) Identification of mutation site based on the comparison 
of the gene nucleotide sequence of lysine-producing B-6 
strain with that of wild type strain ATCC 13032 

Corynebacterium glutamicum B-6, which is resistant 
to S-(2-aminoethyl)cysteine (AEC), rifampicin, streptomycin 
and 6-azauracil, is a lysine-producing mutant bred by 
subjecting the wild type ATCC 13032 strain to multiple 
rounds of random mutagenesis with a mutagen, 
N-methyl-N'-nitro-N-nitrosoguanidine (NTG), and screening 
(Appl. Microbiol. Biotechnol., 32: 269-273 (1989)). First, 
the nucleotide sequences of genes derived from the B-6 
strain and considered to relate to lysine production were 
determined by a method similar to that described above. 
The genes relating to lysine production include lysE and 
lysG, which are lysine-excreting genes; ddh, dapA, hom and 
lysC (encoding diaminopimelate dehydrogenase, 
dihydrodipicolinate synthase, homoserine dehydrogenase and 
aspartokinase, respectively), which are lysine-biosynthetic 
genes; and pyc and zwf (encoding pyruvate carboxylase and 
glucose-6-phosphate dehydrogenase, respectively), which are 
glucose-metabolizing genes. The nucleotide sequences of 
the genes derived from the production strain were compared 
with the corresponding nucleotide sequences of the ATCC 
13032 strain genome represented by SEQ ID NOS:1 to 3501 and 
analyzed. As a result, mutation points were observed in 
many genes. For example, no mutation site was observed in 
lysE, lysG, ddh, dapA, and the like, whereas amino acid 
replacement mutations were found in hom, lysC, pyc, zwf, 
and the like. Among these mutation points, those 
considered to contribute to the production were extracted 
on the basis of known biochemical or genetic information. 
Among the mutation points thus extracted, a mutation, 
Val59Ala, in hom and a mutation, Pro458Ser, in pyc were 
evaluated as to whether or not they were effective 
according to the following method. 
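
The comparison described above, in which an amino acid replacement such as Val59Ala is identified between the wild type and the production strain, can be sketched as follows. This is a minimal illustration under stated assumptions: the two protein sequences are already translated and of equal length (substitutions only, no insertions or deletions), the function name is hypothetical, and the short sequences in the usage example are invented toy data, not the hom or pyc sequences.

```python
# One-letter to three-letter amino acid codes (the 20 standard residues).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def substitutions(wild_type: str, mutant: str) -> list[str]:
    """List replacements in the form Val59Ala (wild type residue, 1-based position, mutant residue)."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length (substitutions only)")
    found = []
    for position, (w, m) in enumerate(zip(wild_type.upper(), mutant.upper()), start=1):
        if w != m:
            found.append(f"{THREE_LETTER[w]}{position}{THREE_LETTER[m]}")
    return found


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(substitutions("MKVPL", "MKAPL"))
    print(substitutions("MPKTL", "MSKTL"))
```

A real comparison would first align the two sequences so that positions correspond even when the lengths differ.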

(2) Evaluation of mutation, Val59Ala, in hom and mutation, 
Pro458Ser, in pyc 

It is known that a mutation in hom inducing 
requirement or partial requirement for homoserine imparts 
lysine productivity to a wild type strain (Amino Acid 
Fermentation, ed. by Hiroshi Aida et al., Japan Scientific 
Societies Press). However, the relationship between the 
mutation, Val59Ala, in hom and lysine production is not 
known. Whether or not the mutation, Val59Ala, in hom is an 
effective mutation can be examined by introducing the 
mutation into the wild type strain and examining the 
lysine productivity of the resulting strain. On the other 
hand, whether or not the mutation, Pro458Ser, in pyc is 
effective can be examined by introducing this mutation 
into a lysine-producing strain which has a deregulated 
lysine-biosynthetic pathway and is free from the pyc 
mutation, and comparing the lysine productivity of the 
resulting strain with that of the parent strain. As such a 
lysine-producing bacterium, No. 58 strain (FERM BP-7134) 
was selected (hereinafter referred to as the 
"lysine-producing No. 58 strain" or the "No. 58 strain"). 
Based on the above, it was decided to introduce the 
mutation, Val59Ala, in hom and the mutation, Pro458Ser, in 
pyc into the wild type strain of Corynebacterium glutamicum 
ATCC 13032 (hereinafter referred to as the "wild type ATCC 
13032 strain" or the "ATCC 13032 strain") and the 
lysine-producing No. 58 strain, respectively, using the 
gene replacement method. A plasmid vector pCES30 for the 
gene replacement was constructed by the following method. 

A plasmid vector pCE53 having a kanamycin-resistant 
gene and being capable of autonomously replicating in 
coryneform bacteria (Mol. Gen. Genet., 196: 175-178 (1984)) 
and a plasmid pMOB3 (ATCC 77282) containing a levansucrase 
gene (sacB) of Bacillus subtilis (Molecular Microbiology, 
6: 1195-1204 (1992)) were each digested with PstI. Then, 
after agarose gel electrophoresis, a pCE53 fragment and a 
2.6 kb DNA fragment containing sacB were each extracted and 
purified using GENE CLEAN Kit (manufactured by BIO 101). 
The pCE53 fragment and the 2.6 kb DNA fragment were ligated 
using Ligation Kit ver. 2 (manufactured by Takara Shuzo), 
introduced into the ATCC 13032 strain by the 
electroporation method (FEMS Microbiology Letters, 65: 299 
(1989)), and cultured on BYG agar medium (medium prepared 
by adding 10 g of glucose, 20 g of peptone (manufactured by 
Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured 
by Difco), and 16 g of Bactoagar (manufactured by Difco) to 
1 liter of water, and adjusting its pH to 7.2) containing 
25 µg/ml kanamycin at 30°C for 2 days to obtain a 
transformant acquiring kanamycin-resistance. As a result 
of digestion analysis with restriction enzymes, it was 
confirmed that a plasmid extracted from the resulting 
transformant by the alkali SDS method had a structure in 
which the 2.6 kb DNA fragment had been inserted into the 
PstI site of pCE53. This plasmid was named pCES30. 

Next, two genes having a mutation point, hom and 
pyc, were amplified by PCR, and inserted into pCES30 
according to the TA cloning method (Bio Experiment 
Illustrated vol. 3, published by Shujunsha). Specifically, 
pCES30 was digested with BamHI (manufactured by Takara 
Shuzo), subjected to agarose gel electrophoresis, and 
extracted and purified using GENE CLEAN Kit (manufactured by 
BIO 101). Both ends of the resulting pCES30 fragment 
were blunted with DNA Blunting Kit (manufactured by Takara 
Shuzo) according to the attached protocol. The blunt-ended 
pCES30 fragment was concentrated by extraction with 
phenol/chloroform and precipitation with ethanol, and 
allowed to react in the presence of Taq polymerase 
(manufactured by Roche Diagnostics) and dTTP at 70°C for 2 
hours so that a nucleotide, thymine (T), was added to the 
3'-end to prepare a T vector of pCES30. 

Separately, chromosomal DNA was prepared from the 
lysine-producing B-6 strain according to the method of 
Saito et al. (Biochim. Biophys. Acta, 72: 619 (1963)). 
Using the chromosomal DNA as a template, PCR was carried 
out with Pfu turbo DNA polymerase (manufactured by 
Stratagene). For the mutated hom gene, the DNAs having the 
nucleotide sequences represented by SEQ ID NOS:7002 and 
7003 were used as the primer set. For the mutated pyc gene, 
the DNAs having the nucleotide sequences represented by SEQ 
ID NOS:7004 and 7005 were used as the primer set. The 
resulting PCR product was subjected to agarose gel 
electrophoresis, and extracted and purified using GENE CLEAN 
Kit (manufactured by BIO 101). Then, the PCR product was 
allowed to react in the presence of Taq polymerase 
(manufactured by Roche Diagnostics) and dATP at 72°C for 10 
minutes so that a nucleotide, adenine (A), was added to the 
3'-end. 

The above pCES30 T vector fragment and the PCR product of 
the mutated hom gene (1.7 kb) or the mutated pyc gene 
(3.6 kb), to which the nucleotide A had been added, were 
concentrated by extraction with phenol/chloroform and 
precipitation with ethanol, and then ligated using Ligation 
Kit ver. 2. The ligation products were introduced into the 
ATCC 13032 strain according to the electroporation method, 
and cultured on BYG agar medium containing 25 µg/ml 
kanamycin at 30°C for 2 days to obtain kanamycin-resistant 
transformants. Each of the resulting transformants was 
cultured overnight in BYG liquid medium containing 25 µg/ml 
kanamycin, and a plasmid was extracted from the culture 
according to the alkali SDS method. As a 
result of digestion analysis using restriction enzymes, it 
was confirmed that the plasmid had a structure in which the 
1.7 kb or 3.6 kb DNA fragment had been inserted into pCES30. 
The plasmids thus constructed were named pChom59 and 
pCpyc458, respectively. 

The introduction of the mutations into the wild type 
ATCC 13032 strain and the lysine-producing No. 58 strain 
by the gene replacement method was carried out 
as follows. Specifically, pChom59 
and pCpyc458 were introduced into the ATCC 13032 strain and 
the No. 58 strain, respectively, and strains in which the 
plasmid was integrated into the chromosomal DNA by 
homologous recombination were selected using the method of 
Ikeda et al. (Microbiology, 144: 1863 (1998)). Then, the 
strains in which the second homologous recombination had 
occurred were selected by a selection method making use 
of the fact that the Bacillus subtilis levansucrase encoded 
by pCES30 produces a suicide substrate (J. Bacteriol., 
174: 5462 (1992)). Among the selected strains, strains in 
which the wild type hom and pyc genes possessed by the ATCC 
13032 strain and the No. 58 strain were replaced with the 
mutated hom and pyc genes, respectively, were isolated. 
The method is specifically explained below. 

One strain was selected from the transformants 
containing the plasmid, pChom59 or pCpyc458, and the 
selected strain was cultured in BYG medium containing 20 
µg/ml kanamycin, and pCG11 (Japanese Published Examined 
Patent Application No. 91827/94) was introduced thereinto 
by the electroporation method. pCG11 is a plasmid vector 
having a spectinomycin-resistant gene and a replication 
origin which is the same as that of pCE53. After introduction 
of pCG11, the strain was cultured on BYG agar medium 
containing 20 µg/ml kanamycin and 100 µg/ml spectinomycin 
at 30°C for 2 days to obtain transformants resistant to both 
kanamycin and spectinomycin. The chromosome of 
one strain of these transformants was examined by 
Southern blotting hybridization according to the method 
reported by Ikeda et al. (Microbiology, 144: 1863 (1998)). 
As a result, it was confirmed that pChom59 or pCpyc458 had 
been integrated into the chromosome by homologous 
recombination of the Campbell type. In such a strain, the 
wild type and mutated hom or pyc genes are present close to 
each other on the chromosome, and the second homologous 
recombination is liable to arise between them. 

Each of these transformants (having been recombined 
once) was spread on Suc agar medium (medium prepared by 
adding 100 g of sucrose, 7 g of meat extract, 10 g of 
peptone, 3 g of sodium chloride, 5 g of yeast extract 
(manufactured by Difco), and 18 g of Bactoagar 
(manufactured by Difco) to 1 liter of water, and adjusting 
its pH to 7.2) and cultured at 30°C for a day. Then the 
colonies thus growing were selected in each case. Since a 
strain in which the sacB gene is present converts sucrose 
into a suicide substrate, it cannot grow in this medium (J. 
Bacteriol., 174: 5462 (1992)). On the other hand, a strain 
in which the sacB gene was deleted due to the second 
homologous recombination between the wild type and the 
mutated hom or pyc genes positioned close to each other 
forms no suicide substrate and, therefore, can grow in this 
medium. In the homologous recombination, either the wild 
type gene or the mutated gene is deleted together with the 
sacB gene. When the wild type gene is deleted together with 
the sacB gene, gene replacement with the mutated type 
occurs. 
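
The selection logic described above can be summarised as a small decision table. The following Python sketch is illustrative only: the function name `classify_isolate` and its two phenotype flags are hypothetical labels, not part of the described method, and the sketch assumes that kanamycin resistance and sucrose sensitivity are both conferred solely by the integrated pCES30 derivative.

```python
def classify_isolate(kanamycin_resistant: bool, grows_on_sucrose: bool) -> str:
    """Classify an isolate from two plate phenotypes (hypothetical helper).

    Assumption: the kanamycin-resistance gene and sacB are carried only on
    the integrated plasmid, so they are gained and lost together.
    """
    if kanamycin_resistant and not grows_on_sucrose:
        # Plasmid (with sacB) still integrated: first recombination only.
        return "single recombinant"
    if not kanamycin_resistant and grows_on_sucrose:
        # Plasmid backbone excised: candidate second recombinant, which may
        # carry either the wild type or the mutated allele.
        return "second recombinant (sequence the gene to tell wild type from mutant)"
    # Any other combination is not explained by this simple model.
    return "unexpected phenotype"


if __name__ == "__main__":
    print(classify_isolate(True, False))
    print(classify_isolate(False, True))
```

Growth on the sucrose medium alone does not show which allele was retained, which is why the gene is sequenced in the next step.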

Chromosomal DNA of each of the thus obtained second 
recombinants was prepared by the above method of Saito et 
al. PCR was carried out using Pfu turbo DNA polymerase 
(manufactured by Stratagene) and the attached buffer. For 
the hom gene, DNAs having the nucleotide sequences 
represented by SEQ ID NOS:7002 and 7003 were used as the 
primer set. For the pyc gene, DNAs having 
the nucleotide sequences represented by SEQ ID NOS:7004 and 
7005 were used as the primer set. The nucleotide sequences 
of the PCR products were determined by the conventional 
method to judge whether the hom or pyc gene of 
the second recombinant was of the wild type or the mutant 
type. As a result, second recombinants having the mutated 
hom gene and the mutated pyc gene were obtained as the target 
strains and were named HD-1 and No. 58pyc, respectively. 

(3) Lysine production test of HD-1 and No. 58pyc strains 

The HD-1 strain (strain obtained by incorporating 
the mutation, Val59Ala, in the hom gene into the ATCC 13032 
strain) and the No. 58pyc strain (strain obtained by 
incorporating the mutation, Pro458Ser, in the pyc gene into 
the lysine-producing No. 58 strain) were subjected to a 
culture test in a 5 l jar fermenter, using the ATCC 13032 
strain and the lysine-producing No. 58 strain, respectively, 
as controls, and lysine production was examined. 

After culturing on BYG agar medium at 30°C for 24 
hours, each strain was inoculated into 250 ml of a seed 
medium (medium prepared by adding 50 g of sucrose, 40 g of 
corn steep liquor, 8.3 g of ammonium sulfate, 1 g of urea, 
2 g of potassium dihydrogenphosphate, 0.83 g of magnesium 
sulfate heptahydrate, 10 mg of iron sulfate heptahydrate, 1 
mg of copper sulfate pentahydrate, 10 mg of zinc sulfate 
heptahydrate, 10 mg of β-alanine, 5 mg of nicotinic acid, 
1.5 mg of thiamin hydrochloride, and 0.5 mg of biotin to 1 
liter of water, and adjusting its pH to 7.2, to which 
30 g of calcium carbonate was then added) contained in a 2 
l baffled Erlenmeyer flask and cultured therein at 
30°C for 12 to 16 hours. The whole of the seed 
culture was inoculated into 1,400 ml of a main 
culture medium (medium prepared by adding 60 g of glucose, 
20 g of corn steep liquor, 25 g of ammonium chloride, 2.5 g 
of potassium dihydrogenphosphate, 0.75 g of magnesium 
sulfate heptahydrate, 50 mg of iron sulfate heptahydrate, 
13 mg of manganese sulfate pentahydrate, 50 mg of calcium 
chloride, 6.3 mg of copper sulfate pentahydrate, 1.3 mg of 
zinc sulfate heptahydrate, 5 mg of nickel chloride 
hexahydrate, 1.3 mg of cobalt chloride hexahydrate, 1.3 mg 
of ammonium molybdate tetrahydrate, 14 mg of nicotinic 
acid, 23 mg of β-alanine, 7 mg of thiamin hydrochloride, 
and 0.42 mg of biotin to 1 liter of water) contained in a 5 
l jar fermenter and cultured therein at 32°C, 1 vvm and 800 
rpm while controlling the pH to 7.0 with aqueous ammonia. 
When glucose in the medium had been consumed, a glucose 
feeding solution (medium prepared by adding 400 g of glucose 
and 45 g of ammonium chloride to 1 liter of water) was 
continuously added. The addition of the feeding solution was 
carried out at a controlled speed so as to maintain the 
dissolved oxygen concentration within a range of 0.5 to 3 
ppm. After culturing for 29 hours, the culture was 
terminated. The cells were separated from the culture 
medium by centrifugation and then L-lysine hydrochloride in 
the supernatant was quantified by high performance liquid 
chromatography (HPLC). The results are shown in Table 2 
below. 

Table 2 

Strain          L-Lysine hydrochloride yield (g/l) 

ATCC 13032       0 
HD-1             8 
No. 58          45 
No. 58pyc       51 



As is apparent from the results shown in Table 2, 
the lysine productivity was improved by introducing the 
mutation, Val59Ala, in the hom gene or the mutation, 
Pro458Ser, in the pyc gene. Accordingly, it was found that 
both mutations are effective mutations relating to the 
production of lysine. Strain AHP-3, in which the mutation, 
Val59Ala, in the hom gene and the mutation, Pro458Ser, in 
the pyc gene have been introduced into the wild type ATCC 
13032 strain together with the mutation, Thr311Ile, in the 
lysC gene, was deposited on December 5, 2000, in 
National Institute of Bioscience and Human Technology, 
Agency of Industrial Science and Technology (Higashi 1-1-3, 
Tsukuba-shi, Ibaraki, Japan) as FERM BP-7382. 



Example 3 

Reconstruction of lysine-producing strain based on genome 
information 

The lysine-producing mutant B-6 strain (Appl. 
Microbiol. Biotechnol., 32: 269-273 (1989)), which has been 
constructed from the wild type ATCC 13032 strain by multiple 
rounds of random mutagenesis with NTG and screening, 
produces a remarkably large amount of lysine hydrochloride 
when cultured in a jar at 32°C using glucose as a carbon 
source. However, since the fermentation period is long, 
the production rate is less than 2.1 g/l/h. Breeding was 
therefore performed to reconstitute, in the wild type ATCC 
13032 strain, only the effective mutations relating to the 
production of lysine among the estimated at least 300 
mutations introduced into the B-6 strain. 

(1) Identification of mutation point and effective mutation 
by comparing the gene nucleotide sequence of the B-6 strain 
with that of the ATCC 13032 strain 

As described above, the nucleotide sequences of 
genes derived from the B-6 strain were compared with the 
corresponding nucleotide sequences of the ATCC 13032 strain 
genome represented by SEQ ID NOS:1 to 3501 and analyzed to 
identify many mutation points accumulated in the chromosome 
of the B-6 strain. Among these, a mutation, Val59Ala, in 
hom, a mutation, Thr311Ile, in lysC, a mutation, Pro458Ser, 
in pyc and a mutation, Ala213Thr, in zwf were specified as 
effective mutations relating to the production of lysine. 
Breeding to reconstitute the 4 mutations in the wild type 
strain and to construct an industrially important 
lysine-producing strain was carried out according to the 
method shown below. 
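
The mutation labels used here follow the pattern "wild type residue, position, mutant residue". As a reading aid, the following Python sketch parses such labels and converts them to one-letter form. It is only an illustration: the names `parse_mutation`, `short_form` and `EFFECTIVE_MUTATIONS` are hypothetical, and the residue table covers only the amino acids named in this Example.

```python
import re

# Three-letter to one-letter codes, limited to the residues named in this text.
THREE_TO_ONE = {"Val": "V", "Ala": "A", "Thr": "T", "Ile": "I", "Pro": "P", "Ser": "S"}

# A label such as "Val59Ala": residue, position, residue.
MUTATION_PATTERN = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_mutation(label):
    """Split a label like 'Val59Ala' into ('Val', 59, 'Ala')."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a recognised mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def short_form(label):
    """Convert 'Val59Ala' to the one-letter form 'V59A'."""
    wild_type, position, mutant = parse_mutation(label)
    return f"{THREE_TO_ONE[wild_type]}{position}{THREE_TO_ONE[mutant]}"


# The four effective mutations specified above, keyed by gene.
EFFECTIVE_MUTATIONS = {
    "hom": "Val59Ala",
    "lysC": "Thr311Ile",
    "pyc": "Pro458Ser",
    "zwf": "Ala213Thr",
}

if __name__ == "__main__":
    for gene, label in EFFECTIVE_MUTATIONS.items():
        print(gene, short_form(label))
```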

(2) Construction of plasmid for gene replacement having 
mutated gene 

The plasmid for gene replacement, pChom59, having 
the mutated hom gene and the plasmid for gene replacement, 
pCpyc458, having the mutated pyc gene were prepared in the 
above Example 2(2). Plasmids for gene replacement having 
the mutated lysC and zwf were produced as described below. 

The lysC and zwf genes having mutation points were 
amplified by PCR, and inserted into the plasmid for gene 
replacement, pCES30, according to the TA cloning method 
described in Example 2(2) (Bio Experiment Illustrated, Vol. 
3). 

Separately, chromosomal DNA was prepared from the 
lysine-producing B-6 strain according to the above method 
of Saito et al. Using the chromosomal DNA as a template, 
PCR was carried out with Pfu turbo DNA polymerase 
(manufactured by Stratagene). For the mutated lysC gene, 
the DNAs having the nucleotide sequences represented by SEQ 
ID NOS:7006 and 7007 were used as the primer set. For the 
mutated zwf gene, the DNAs having the nucleotide sequences 
represented by SEQ ID NOS:7008 and 7009 were used as the 
primer set. The resulting PCR product was subjected to 
agarose gel electrophoresis, and extracted and purified using 
GENE CLEAN Kit (manufactured by BIO 101). Then, the PCR 
product was allowed to react in the presence of Taq DNA 
polymerase (manufactured by Roche Diagnostics) and dATP at 
72°C for 10 minutes so that a nucleotide, adenine (A), was 
added to the 3'-end. 

The above pCES30 T vector fragment and the PCR product of 
the mutated lysC gene (1.5 kb) or the mutated zwf gene 
(2.3 kb), to which the nucleotide A had been added, were 
concentrated by extraction with phenol/chloroform and 
precipitation with ethanol, and then ligated using Ligation 
Kit ver. 2. The ligation products were introduced into the 
ATCC 13032 strain according to the electroporation method, 
and cultured on BYG agar medium containing 25 µg/ml 
kanamycin at 30°C for 2 days to obtain kanamycin-resistant 
transformants. Each of the resulting transformants was 
cultured overnight in BYG liquid medium containing 25 µg/ml 
kanamycin, and a plasmid was extracted from the culture 
according to the alkali SDS method. As a 
result of digestion analysis using restriction enzymes, it 
was confirmed that the plasmid had a structure in which the 
1.5 kb or 2.3 kb DNA fragment had been inserted into pCES30. 
The plasmids thus constructed were named pClysC311 and 
pCzwf213, respectively. 

(3) Introduction of mutation, Thr311Ile, in lysC into one 
point mutant HD-1 

Since the one point mutant HD-1, in which 
the mutation, Val59Ala, in hom was introduced into the wild 
type ATCC 13032 strain, had been obtained in Example 2(2), 
the mutation, Thr311Ile, in lysC was introduced into the 
HD-1 strain using pClysC311 produced in the above (2) 
according to the gene replacement method described in 
Example 2(2). PCR was carried out using chromosomal DNA of 
the resulting strain and, as the primer set, DNAs having 
the nucleotide sequences represented by SEQ ID NOS:7006 and 
7007 in the same manner as in Example 2(2). The nucleotide 
sequence of the PCR product was determined in the usual 
manner, and it was thereby confirmed that 
the strain, which was named AHD-2, was a two point mutant 
having the mutated lysC gene in addition to the mutated hom 
gene. 

(4) Introduction of mutation, Pro458Ser, in pyc into two 
point mutant AHD-2 

The mutation, Pro458Ser, in pyc was introduced into 
the AHD-2 strain using the pCpyc458 produced in Example 
2(2) by the gene replacement method described in Example 
2(2). PCR was carried out using chromosomal DNA of the 
resulting strain and, as the primer set, DNAs having the 
nucleotide sequences represented by SEQ ID NOS:7004 and 
7005 in the same manner as in Example 2(2). The nucleotide 
sequence of the PCR product was determined in the usual 
manner, and it was thereby confirmed that 
the strain, which was named AHP-3, was a three point mutant 
having the mutated pyc gene in addition to the mutated hom 
gene and lysC gene. 

(5) Introduction of mutation, Ala213Thr, in zwf into three 
point mutant AHP-3 

The mutation, Ala213Thr, in zwf was introduced into 
the AHP-3 strain using the pCzwf213 produced in the above 
(2) by the gene replacement method described in Example 
2(2). PCR was carried out using chromosomal DNA of the 
resulting strain and, as the primer set, DNAs having the 
nucleotide sequences represented by SEQ ID NOS:7008 and 
7009 in the same manner as in Example 2(2). The nucleotide 
sequence of the PCR product was determined in the usual 
manner, and it was thereby confirmed that 
the strain, which was named APZ-4, was a four point mutant 
having the mutated zwf gene in addition to the mutated hom 
gene, lysC gene and pyc gene. 
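
The stepwise construction in (3) to (5) can be written down as a simple lineage record. The sketch below is only a bookkeeping illustration: the `STEPS` table and the `genotype` function are hypothetical names, and the plasmid names follow those given in Example 2(2) and in (2) above.

```python
# For each strain: (parent strain, gene, mutation introduced, plasmid used).
STEPS = {
    "HD-1": ("ATCC 13032", "hom", "Val59Ala", "pChom59"),
    "AHD-2": ("HD-1", "lysC", "Thr311Ile", "pClysC311"),
    "AHP-3": ("AHD-2", "pyc", "Pro458Ser", "pCpyc458"),
    "APZ-4": ("AHP-3", "zwf", "Ala213Thr", "pCzwf213"),
}


def genotype(strain):
    """Return the (gene, mutation) pairs accumulated from the wild type."""
    chain = []
    # Walk back through the parents until a strain with no recorded step
    # (the wild type) is reached.
    while strain in STEPS:
        parent, gene, mutation, _plasmid = STEPS[strain]
        chain.append((gene, mutation))
        strain = parent
    return list(reversed(chain))


if __name__ == "__main__":
    for name in STEPS:
        print(name, genotype(name))
```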



(6) Lysine production test on HD-1, AHD-2, AHP-3 and APZ-4 
strains 

The HD-1, AHD-2, AHP-3 and APZ-4 strains obtained 
above were subjected to a culture test in a 5 l jar 
fermenter in accordance with the method of Example 2(3). 

Table 3 shows the results. 

Table 3 

Strain    L-Lysine hydrochloride (g/l)    Productivity (g/l/h) 

HD-1       8                              0.3 
AHD-2     73                              2.5 
AHP-3     80                              2.8 
APZ-4     86                              3.0 



Since the lysine-producing mutant B-6 strain which 
has been bred based on the random mutation and selection 
shows a productivity of less than 2.1 g/l/h, the APZ-4 
strain showing a high productivity of 3.0 g/l/h is useful 
in industry. 
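
The productivity figures in Table 3 are consistent with dividing each final titer by the culture time. The sketch below assumes that the 29-hour culture time stated in Example 2(3) also applies to the tests in this Example, which the text implies but does not state outright. The names `CULTURE_TIME_H`, `TITERS_G_PER_L` and `productivity` are illustrative.

```python
# Culture time stated in Example 2(3); assumed here to apply to Table 3 as well.
CULTURE_TIME_H = 29

# Final L-lysine hydrochloride titers (g/l) taken from Table 3.
TITERS_G_PER_L = {"HD-1": 8, "AHD-2": 73, "AHP-3": 80, "APZ-4": 86}


def productivity(titer_g_per_l, hours=CULTURE_TIME_H):
    """Volumetric productivity in g/l/h: final titer divided by culture time."""
    return titer_g_per_l / hours


if __name__ == "__main__":
    for strain, titer in TITERS_G_PER_L.items():
        print(f"{strain}: {productivity(titer):.1f} g/l/h")
```

Rounded to one decimal place, these values agree with the productivity column of Table 3.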

(7) Lysine fermentation by APZ-4 strain at high temperature 

The APZ-4 strain, which had been reconstructed by 
introducing 4 effective mutations into the wild type strain, 
was subjected to the culturing test in a 5 l jar fermenter 
in the same manner as in Example 2(3), except that the 
culturing temperature was changed to 40°C. 

The results are shown in Table 4. 



Table 4 


Temperature (°C)    L-Lysine hydrochloride (g/l)    Productivity (g/l/h) 

32                  86                              3.0 
40                  95                              3.3 


As is apparent from the results shown in Table 4, 
the lysine hydrochloride titer and productivity obtained in 
culturing at a high temperature of 40°C were comparable to 
those obtained at 32°C. In the lysine-producing B-6 strain 
bred by repeating random 
mutation and selection, the growth and the lysine 
productivity are lowered at temperatures exceeding 34°C so 
that lysine fermentation cannot be carried out, whereas 
lysine fermentation can be carried out using the APZ-4 
strain at a high temperature of 40°C, so that the cooling 
load is greatly reduced, which is industrially useful. 
The lysine fermentation at high temperatures can be 
achieved because the high temperature adaptability 
inherently possessed by the wild type strain is retained in 
the APZ-4 strain. 

As demonstrated in the reconstruction of the 
lysine-producing strain, the present invention provides a 
novel breeding method effective for eliminating the 
problems in the conventional mutants and acquiring 
industrially advantageous strains. This methodology, which 
reconstructs the production strain by reconstituting the 
effective mutations, is an approach which is efficiently 
carried out using the nucleotide sequence information of 
the genome disclosed in the present invention, and its 
effectiveness was found for the first time in the present 
invention. 

Example 4 

Production of DNA microarray and use thereof 

A DNA microarray was produced based on the 
nucleotide sequence information of the ORFs deduced, using 
software, from the full nucleotide sequence of 
Corynebacterium glutamicum ATCC 13032, and genes whose 
expression fluctuates depending on the carbon source during 
culturing were searched for. 

(1) Production of DNA microarray 

Chromosomal DNA was prepared from Corynebacterium 
glutamicum ATCC 13032 by the method of Saito et al. 
(Biochim. Biophys. Acta, 72: 619 (1963)). Based on 24 
genes having the nucleotide sequences represented by SEQ ID 
NOS:207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 
3451, 3453, 3455, 1743, 3470, 2132, 3476, 3477, 3485, 3488, 
3489, 3494, 3496, and 3497 from the ORFs shown in Table 1 
deduced from the full genome nucleotide sequence of 
Corynebacterium glutamicum ATCC 13032 using software and 
the nucleotide sequence of rabbit globin gene (GenBank 
Accession No. V00882) used as an internal standard, oligo 
DNA primers for PCR amplification represented by SEQ ID 
NOS:7010 to 7059 targeting the nucleotide sequences of the 
genes were synthesized in a usual manner. 

As the oligo DNA primers used for the PCR, DNAs having the 
nucleotide sequences represented by the following pairs of 
SEQ ID NOS were used as the respective primer sets for the 
amplification of the DNA having the nucleotide sequence 
represented by the indicated SEQ ID NO (or, for the last 
pair, the nucleotide sequence of the rabbit globin gene). 

Primer set (SEQ ID NOS)    Amplified DNA (SEQ ID NO) 

7010 and 7011              207 
7012 and 7013              3433 
7014 and 7015              281 
7016 and 7017              3435 
7018 and 7019              3439 
7020 and 7021              765 
7022 and 7023              3445 
7024 and 7025              1226 
7026 and 7027              1229 
7028 and 7029              3448 
7030 and 7031              3451 
7032 and 7033              3453 
7034 and 7035              3455 
7036 and 7037              1743 
7038 and 7039              3470 
7040 and 7041              2132 
7042 and 7043              3476 
7044 and 7045              3477 
7046 and 7047              3485 
7048 and 7049              3488 
7050 and 7051              3489 
7052 and 7053              3494 
7054 and 7055              3496 
7056 and 7057              3497 
7058 and 7059              rabbit globin gene 
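
The primer assignment listed above follows a regular pattern: consecutive pairs of SEQ ID NOS starting at 7010 are assigned to the targets in the order in which the targets are listed. The following Python sketch reproduces that pairing. The names `TARGETS`, `primer_pair` and `PRIMERS` are hypothetical, and the sketch does not say which member of a pair is the forward or the reverse primer, since the text does not specify it.

```python
# Targets in the order given in the text; the last entry is the internal standard.
TARGETS = [
    207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448,
    3451, 3453, 3455, 1743, 3470, 2132, 3476, 3477, 3485, 3488,
    3489, 3494, 3496, 3497, "rabbit globin",
]

FIRST_PRIMER_SEQ_ID = 7010  # first primer SEQ ID NO used in this Example


def primer_pair(index):
    """Return the two primer SEQ ID NOS for the target at the given position."""
    first = FIRST_PRIMER_SEQ_ID + 2 * index
    return first, first + 1


# Map each target to its primer pair.
PRIMERS = {target: primer_pair(i) for i, target in enumerate(TARGETS)}

if __name__ == "__main__":
    for target, pair in PRIMERS.items():
        print(target, pair)
```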

The PCR was carried out for 30 cycles with each cycle 
consisting of 15 seconds at 95°C and 3 minutes at 68°C 
using a thermal cycler (GeneAmp PCR system 9600, 
manufactured by Perkin Elmer), TaKaRa Ex-Taq (manufactured 
by Takara Shuzo), 100 ng of the chromosomal DNA and the 
buffer attached to the TaKaRa Ex-Taq reagent. In the case 
of the rabbit globin gene, a single-stranded cDNA which had 
been synthesized from rabbit globin mRNA (manufactured by 
Life Technologies) according to the manufacturer's 
instructions using a reverse transcriptase RAV-2 
(manufactured by Takara Shuzo) was used as the template. The 
PCR product of each gene thus amplified was subjected to 
agarose gel electrophoresis and extracted and purified using 
QIAquick Gel Extraction Kit (manufactured by QIAGEN). The 
purified PCR product was concentrated by precipitating it 
with ethanol and adjusted to a concentration of 200 ng/µl. 
Each PCR product was spotted on a slide glass plate 
(manufactured by Matsunami Glass) having MAS coating in 2 
runs using GTMASS SYSTEM (manufactured by Nippon Laser & 
Electronics Lab.) according to the manufacturer's 
instructions. 

(2) Synthesis of fluorescence labeled cDNA 

The ATCC 13032 strain was spread on BY agar medium 
(medium prepared by adding 20 g of peptone (manufactured by 
Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured 
by Difco), and 16 g of Bactoagar (manufactured by Difco) to 
1 liter of water and adjusting its pH to 7.2) and 
cultured at 30°C for 2 days. Then, the cultured strain was 
further inoculated into 5 ml of BY liquid medium and 
cultured at 30°C overnight. Then, the cultured strain was 
further inoculated into 30 ml of a minimum medium (medium 
prepared by adding 5 g of ammonium sulfate, 5 g of urea, 
0.5 g of monopotassium dihydrogenphosphate, 0.5 g of 
dipotassium monohydrogenphosphate, 20.9 g of 
morpholinopropanesulfonic acid, 0.25 g of magnesium sulfate 
heptahydrate, 10 mg of calcium chloride dihydrate, 10 mg of 
manganese sulfate monohydrate, 10 mg of ferrous sulfate 
heptahydrate, 1 mg of zinc sulfate heptahydrate, 0.2 mg of 
copper sulfate, and 0.2 mg of biotin to 1 liter of water, and 
adjusting its pH to 6.5) containing 110 mmol/l glucose or 
200 mmol/l ammonium acetate, and cultured in an Erlenmeyer 
flask at 30°C until the absorbance at 660 nm reached 1.0. After 
the cells were collected by centrifuging at 4°C and 5,000 
rpm for 10 minutes, total RNA was prepared from the 
resulting cells according to the method of Bormann et al. 
(Molecular Microbiology, 6: 317-326 (1992)). To avoid 
contamination with DNA, the RNA was treated with DNase I 
(manufactured by Takara Shuzo) at 37°C for 30 minutes and 
then further purified using Qiagen RNeasy MiniKit 
(manufactured by QIAGEN) according to the manufacturer's 
instructions. To 30 µg of the resulting total RNA, 0.6 µl 
of rabbit globin mRNA (50 ng/µl, manufactured by Life 
Technologies) and 1 µl of a random 6 mer primer (500 ng/µl, 
manufactured by Takara Shuzo) were added for denaturing at 
65°C for 10 minutes, followed by quenching on ice. To the 
resulting solution, 6 µl of a buffer attached to 
Superscript II (manufactured by Life Technologies), 3 µl of 
0.1 mol/l DTT, 1.5 µl of dNTPs (25 mmol/l dATP, 25 mmol/l 
dCTP, 25 mmol/l dGTP, 10 mmol/l dTTP), 1.5 µl of Cy5-dUTP 
or Cy3-dUTP (manufactured by NEN) and 2 µl of Superscript 
II were added, and allowed to stand at 25°C for 10 minutes 
and then at 42°C for 110 minutes. The RNA extracted from 
the cells using glucose as the carbon source and the RNA 
extracted from the cells using ammonium acetate were 
labeled with Cy5-dUTP and Cy3-dUTP, respectively. After 
the fluorescence labeling reaction, the RNA was digested by 
adding 1.5 µl of 1 mol/l sodium hydroxide-20 mmol/l EDTA 
solution and 3.0 µl of 10% SDS solution, and allowing the 
mixture to stand at 65°C for 10 minutes. The two cDNA 
solutions after the labeling were mixed and purified using 
Qiagen PCR purification Kit (manufactured by QIAGEN) 
according to the manufacturer's instructions to give a 
volume of 10 µl. 

(3) Hybridization 

UltraHyb (110 µl) (manufactured by Ambion) and the 
fluorescence-labeled cDNA solution (10 µl) were mixed and 
subjected to hybridization and the subsequent washing of 
slide glass using GeneTAC Hybridization Station 
(manufactured by Genomic Solutions) according to the 
manufacturer's instructions. The hybridization was carried 
out at 50°C, and the washing was carried out at 25°C. 

(4) Fluorescence analysis 

The fluorescence amount of each DNA array having the 
fluorescent cDNA hybridized therewith was measured using 
ScanArray 4000 (manufactured by GSI Lumonics). 

Table 5 shows the Cy3 and Cy5 signal intensities of
the genes, corrected on the basis of the data of the rabbit
globin used as the internal standard, and the Cy3/Cy5
ratios.

Table 5



SEQ ID NO     Cy3 intensity   Cy5 intensity   Cy3/Cy5
207           5248            3240            1.62
3433          2239            2694            0.83
281           2370            2595            0.91
3435          2566            2515            1.02
3439          5597            6944            0.81
765           6134            4943            1.24
3455          1169            1284            0.91
1226          1301            1493            0.87
1229          1168            1131            1.03
3448          1187            1594            0.74
3451          2845            3859            0.74
3453          3498            1705            2.05
3455          1491            1144            1.30
1743          1972            1841            1.07
3470          4752            3764            1.26
[illegible]   1173            1085            1.08
3476          1847            1420            1.30
3477          1284            1164            1.10
3485          4539            8014            0.57
3488          34289           1398            24.52
3489          43645           1497            29.16
3494          3199            2503            1.28
3496          3428            2364            1.45
3497          3848            3358            1.15

The ORF function data estimated by using software
were searched for SEQ ID NOS:3488 and 3489 showing
remarkably strong Cy3 signals. As a result, it was found
that SEQ ID NOS:3488 and 3489 are a malate synthase gene
and an isocitrate lyase gene, respectively. It is known
that these genes are transcriptionally induced by acetic
acid in Corynebacterium glutamicum (Archives of
Microbiology, 168: 262-269 (1997)).
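
The selection of strongly induced genes from Table 5 can be
sketched as follows. This is a minimal illustration only: the
rows are copied from Table 5, whereas the 10-fold cutoff and the
function names are assumptions introduced here and are not part
of the described method.

```python
# Selected rows of Table 5: (SEQ ID NO, Cy3 intensity, Cy5 intensity).
TABLE5_ROWS = [
    (207, 5248, 3240),
    (3485, 4539, 8014),
    (3488, 34289, 1398),
    (3489, 43645, 1497),
]

# Hypothetical cutoff, chosen only for this illustration.
FOLD_CUTOFF = 10.0


def cy3_cy5_ratio(cy3, cy5):
    """Ratio of the corrected Cy3 signal to the corrected Cy5 signal."""
    return cy3 / cy5


def strongly_induced(rows, cutoff=FOLD_CUTOFF):
    """Return the SEQ ID NOs whose Cy3/Cy5 ratio is at or above the cutoff."""
    return [seq_id for seq_id, cy3, cy5 in rows
            if cy3_cy5_ratio(cy3, cy5) >= cutoff]


for seq_id, cy3, cy5 in TABLE5_ROWS:
    print(seq_id, round(cy3_cy5_ratio(cy3, cy5), 2))
print(strongly_induced(TABLE5_ROWS))
```

With these rows, only SEQ ID NOS:3488 and 3489 pass the assumed
cutoff, in agreement with the two genes singled out above.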

As described above, a gene whose expression
fluctuates could be discovered by synthesizing appropriate
oligo DNA primers based on the ORF nucleotide sequence
information deduced, using software, from the full genomic
nucleotide sequence information of Corynebacterium
glutamicum ATCC 13032, amplifying the nucleotide sequence
of the gene using the genome DNA of Corynebacterium
glutamicum as a template in the PCR reaction, and thus
producing and using a DNA microarray.

This Example shows that the expression amount can
be analyzed using a DNA microarray for the 24 genes. On the
other hand, the present DNA microarray techniques make it
possible to prepare DNA microarrays having thereon several
thousand gene probes at once. Accordingly, it is also
possible to prepare DNA microarrays having thereon all of
the ORF gene probes deduced from the full genomic
nucleotide sequence of Corynebacterium glutamicum ATCC
13032 determined by the present invention, and analyze the
expression profile at the total gene level of
Corynebacterium glutamicum using these arrays.

Example 5

Homology search using Corynebacterium glutamicum genome
sequence

(1) Search of adenosine deaminase 

The amino acid sequence (ADD_ECOLI) of Escherichia
coli adenosine deaminase was obtained from Swiss-Prot
Database as the amino acid sequence of the protein of which
function had been confirmed as adenosine deaminase
(EC3.5.4.4). By using the full length of this amino acid
sequence as a query, a homology search was carried out on a
nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or a database of the amino acid
sequences in the ORF region deduced from the genome
sequence using FASTA program (Proc. Natl. Acad. Sci. USA,
85: 2444-2448 (1988)). A case where the E-value was 1e-10
or less was judged as being significantly homologous. As a
result, no sequence significantly homologous with the
Escherichia coli adenosine deaminase was found in the
nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or the database of the amino
acid sequences in the ORF region deduced from the genome
sequence. Based on these results, it is assumed that
Corynebacterium glutamicum contains no ORF having adenosine
deaminase activity and thus has no activity of converting
adenosine into inosine.

(2) Search of glycine cleavage enzyme

The sequences (GCSP_ECOLI, GCST_ECOLI and
GCSH_ECOLI) of glycine decarboxylase, aminomethyl
transferase and an aminomethyl group carrier, each of which
is a component of Escherichia coli glycine cleavage enzyme,
as the amino acid sequences of the proteins of which
function had been confirmed as glycine cleavage enzyme
(EC2.1.2.10), were obtained from Swiss-Prot Database.

By using these full-length amino acid sequences as
a query, a homology search was carried out on a nucleotide
sequence database of the genome sequence of Corynebacterium
glutamicum or a database of the ORF amino acid sequences
deduced from the genome sequence using FASTA program. A
case where the E-value was 1e-10 or less was judged as
being significantly homologous. As a result, no sequence
significantly homologous with the glycine decarboxylase,
the aminomethyl transferase or the aminomethyl group
carrier, each of which is a component of Escherichia coli
glycine cleavage enzyme, was found in the nucleotide
sequence database of the genome sequence of Corynebacterium
glutamicum or the database of the ORF amino acid sequences
estimated from the genome sequence. Based on these results,
it is assumed that Corynebacterium glutamicum contains no
ORF having the activity of glycine decarboxylase,
aminomethyl transferase or the aminomethyl group carrier
and thus has no activity of the glycine cleavage enzyme.
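
The significance criterion used in the searches above (an
E-value of 1e-10 or less) can be sketched as a simple filter.
The hit list below is hypothetical data made up for
illustration; it is not FASTA output from the searches
described, and the function name is likewise an assumption.

```python
# Cutoff stated in the text: an E-value of 1e-10 or less is significant.
E_VALUE_CUTOFF = 1e-10


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep (subject, E-value) pairs at or below the cutoff, best hit first."""
    kept = [hit for hit in hits if hit[1] <= cutoff]
    return sorted(kept, key=lambda hit: hit[1])


# Hypothetical hits (subject name, E-value), for illustration only.
example_hits = [
    ("orf_A", 3.2e-45),
    ("orf_B", 0.004),
    ("orf_C", 1e-10),
    ("orf_D", 7.5e-9),
]

print(significant_hits(example_hits))
```

An empty result from such a filter corresponds to the "no
significantly homologous sequence" outcome reported in (1) and
(2).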

(3) Search of IMP dehydrogenase

The amino acid sequence (IMDH_ECOLI) of Escherichia
coli IMP dehydrogenase as the amino acid sequence of the
protein, of which function had been confirmed as IMP
dehydrogenase (EC1.1.1.205), was obtained from Swiss-Prot
Database. By using the full length of this amino acid
sequence as a query, a homology search was carried out on a
nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or a database of the ORF amino
acid sequences predicted from the genome sequence using
FASTA program. A case where the E-value was 1e-10 or less
was judged as being significantly homologous. As a result,
the amino acid sequences encoded by two ORFs, namely, an
ORF positioned in the region of the nucleotide sequence No.
615336 to 616853 (or ORF having the nucleotide sequence
represented by SEQ ID NO:672) and another ORF positioned in
the region of the nucleotide sequence No. 616973 to 618094
(or ORF having the nucleotide sequence represented by SEQ
ID NO:674) were significantly homologous with the ORF of
Escherichia coli IMP dehydrogenase. By using the
above-described predicted amino acid sequences as a query
in order to examine the similarity of the amino acid
sequences encoded by the ORFs with IMP dehydrogenases of
other organisms in greater detail, a search was carried out
on GenBank (http://www.ncbi.nlm.nih.gov/) nr-aa database
(amino acid sequence database constructed on the basis of
GenBank CDS translation products, PDB database, Swiss-Prot
database, PIR database and PRF database by eliminating
duplicated registrations) using BLAST program. As a result,
both of the two amino acid sequences showed significant
homologies with IMP dehydrogenases of other organisms and
clearly higher homologies with IMP dehydrogenases than with
amino acid sequences of other proteins, and thus, it was
assumed that the two ORFs would function as IMP
dehydrogenase. Based on these results, it was therefore
assumed that Corynebacterium glutamicum has two ORFs having
the IMP dehydrogenase activity.
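
As a quick consistency check on the two ORF positions given
above, the span lengths can be computed from the coordinates.
The sketch assumes that the nucleotide positions are 1-based and
inclusive, which the text does not state explicitly; the function
names are illustrative.

```python
def orf_length_nt(start, end):
    """Length in nucleotides of a span with inclusive 1-based coordinates."""
    return end - start + 1


def orf_codons(start, end):
    """Number of whole codons in the span; fails if it is not a multiple of 3."""
    length = orf_length_nt(start, end)
    if length % 3 != 0:
        raise ValueError("span length is not a multiple of 3")
    return length // 3


# Coordinates quoted in the text for SEQ ID NO:672 and SEQ ID NO:674.
for name, start, end in [("SEQ ID NO:672", 615336, 616853),
                         ("SEQ ID NO:674", 616973, 618094)]:
    print(name, orf_length_nt(start, end), orf_codons(start, end))
```

Under that assumption both spans are whole multiples of three
nucleotides (1,518 and 1,122), as expected for open reading
frames.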



Example 6 

Proteome analysis of proteins derived from Corynebacterium
glutamicum

(1) Preparations of proteins derived from Corynebacterium
glutamicum ATCC 13032, FERM BP-7134 and FERM BP-158

Culturing tests of Corynebacterium glutamicum ATCC
13032 (wild type strain), Corynebacterium glutamicum FERM
BP-7134 (lysine-producing strain) and Corynebacterium
glutamicum FERM BP-158 (lysine-highly producing strain)
were carried out in a 5 l jar fermenter according to the
method in Example 2(3). The results are shown in Table 6.

Table 6

Strain          L-Lysine yield (g/l)
ATCC 13032      0
FERM BP-7134    45
FERM BP-158     60


After culturing, cells of each strain were
recovered by centrifugation. These cells were washed with
Tris-HCl buffer (10 mmol/l Tris-HCl, pH 6.5, 1.6 mg/ml
protease inhibitor (COMPLETE; manufactured by Boehringer
Mannheim)) three times to give washed cells which could be
stored under freezing at -80°C. The freeze-stored cells
were thawed before use, and used as washed cells.

The washed cells described above were suspended in
a disruption buffer (10 mmol/l Tris-HCl, pH 7.4, 5 mmol/l
magnesium chloride, 50 mg/l RNase, 1.6 mg/ml protease
inhibitor (COMPLETE; manufactured by Boehringer Mannheim)),
and disrupted with a disruptor (manufactured by Brown)
under cooling. To the resulting disruption solution, DNase
was added to give a concentration of 50 mg/l, and the
solution was allowed to stand on ice for 10 minutes. The
solution was centrifuged (5,000 x g, 15 minutes, 4°C) to
remove the undisrupted cells as the precipitate, and the
supernatant was recovered.

To the supernatant, urea was added to give a
concentration of 9 mol/l, and an equivalent amount of a
lysis buffer (9.5 mol/l urea, 2% NP-40, 2% Ampholine, 5%
mercaptoethanol, 1.6 mg/ml protease inhibitor (COMPLETE;
manufactured by Boehringer Mannheim)) was added thereto,
followed by thorough stirring at room temperature for
dissolving.

After being dissolved, the solution was centrifuged
at 12,000 x g for 15 minutes, and the supernatant was
recovered.

To the supernatant, ammonium sulfate was added to 
the extent of 80% saturation, followed by thoroughly 
stirring for dissolving. 

After being dissolved, the solution was centrifuged
(16,000 x g, 20 minutes, 4°C), and the precipitate was
recovered. This precipitate was dissolved in the lysis
buffer again and used in the subsequent procedures as a
protein sample. The protein concentration of this sample
was determined by the protein quantification method of
Bradford.

(2) Separation of protein by two dimensional 
electrophoresis 

The first dimensional electrophoresis was carried 
out as described below by the isoelectric electrophoresis 
method. 

A molded dry IPG strip gel (pH 4-7, 13 cm,
Immobiline DryStrips; manufactured by Amersham Pharmacia
Biotech) was set in an electrophoretic apparatus (Multiphor
II or IPGphor; manufactured by Amersham Pharmacia Biotech)
and a swelling solution (8 mol/l urea, 0.5% Triton X-100,
0.6% dithiothreitol, 0.5% Ampholine, pH 3-10) was packed
therein, and the gel was allowed to stand for swelling for
12 to 16 hours.

The protein sample prepared above was dissolved in
a sample solution (9 mol/l urea, 2% CHAPS, 1%
dithiothreitol, 2% Ampholine, pH 3-10), and then about 100
to 500 µg (in terms of protein) portions thereof were taken
and added to the swollen IPG strip gel.

The electrophoresis was carried out in the 4 steps
defined below while controlling the temperature at 20°C:
step 1: 1 hour under a gradient mode of 0 to 500 V;
step 2: 1 hour under a gradient mode of 500 to 1,000 V;
step 3: 4 hours under a gradient mode of 1,000 to 8,000 V;
and
step 4: 1 hour at a constant voltage of 8,000 V.
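
The total focusing load of the four steps above can be estimated
in volt-hours. This sketch assumes that the voltage rises
linearly during each gradient step; the text specifies only a
"gradient mode", so the ramp shape and the resulting total are
illustrative rather than reported values.

```python
# (start voltage in V, end voltage in V, duration in hours) for steps 1 to 4.
IEF_STEPS = [
    (0, 500, 1.0),
    (500, 1000, 1.0),
    (1000, 8000, 4.0),
    (8000, 8000, 1.0),
]


def volt_hours(v_start, v_end, hours):
    """Volt-hours of one step, assuming a linear ramp from v_start to v_end."""
    return (v_start + v_end) / 2.0 * hours


def total_volt_hours(steps):
    """Sum of the volt-hours over all steps of the program."""
    return sum(volt_hours(v0, v1, h) for v0, v1, h in steps)


print(total_volt_hours(IEF_STEPS))
```

Under the linear-ramp assumption the program amounts to 27,000
volt-hours, of which the 4-hour step 3 contributes 18,000.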

After the isoelectric electrophoresis, the IPG
strip gel was taken off from the holder and soaked in an
equilibration buffer A (50 mmol/l Tris-HCl, pH 6.8, 30%
glycerol, 1% SDS, 0.25% dithiothreitol) for 15 minutes and
another equilibration buffer B (50 mmol/l Tris-HCl, pH 6.8,
6 mol/l urea, 30% glycerol, 1% SDS, 0.45% iodoacetamide)
for 15 minutes to sufficiently equilibrate the gel.

After the equilibration, the IPG strip gel was
lightly rinsed in an SDS electrophoresis buffer (1.4%
glycine, 0.1% SDS, 0.3% Tris-HCl, pH 8.5), and the second
dimensional electrophoresis depending on molecular weight
was carried out as described below to separate the proteins.

Specifically, the above IPG strip gel was closely
placed on 14% polyacrylamide slab gel (14% polyacrylamide,
0.37% bisacrylamide, 37.5 mmol/l Tris-HCl, pH 8.8, 0.1% SDS,
0.1% TEMED, 0.1% ammonium persulfate) and subjected to
electrophoresis under a constant current of 30 mA at 20°C
for 3 hours to separate the proteins.

(3) Detection of protein spot 

Coomassie staining was performed by the method of
Gorg et al. (Electrophoresis, 9: 531-546 (1988)) for the
slab gel after the second dimensional electrophoresis.
Specifically, the slab gel was stained under shaking at
25°C for about 3 hours, the excess stain was removed with
a decoloring solution, and the gel was thoroughly washed
with distilled water.

The results are shown in Fig. 2. The proteins
derived from the ATCC 13032 strain (Fig. 2A), FERM BP-7134
strain (Fig. 2B) and FERM BP-158 strain (Fig. 2C) could be
separated and detected as spots.

(4) In-gel digestion of detected protein spot 

The detected spots were each cut out from the gel
and transferred into a siliconized tube, and 400 µl of 100
mmol/l ammonium bicarbonate : acetonitrile solution (1:1,
v/v) was added thereto, followed by shaking overnight, and
the gel was freeze-dried as such. To the dried gel, 10 µl
of a lysylendopeptidase (LysC) solution (manufactured by
WAKO, prepared with 0.1% SDS-containing 50 mmol/l ammonium
bicarbonate to give a concentration of 100 ng/µl) was added
and the gel was allowed to stand for swelling at 0°C for 45
minutes, and then allowed to stand at 37°C for 16 hours.
After removing the LysC solution, 20 µl of an extracting
solution (a mixture of 60% acetonitrile and 5% formic acid)
was added, followed by ultrasonication at room temperature
for 5 minutes to disrupt the gel. After the disruption,
the extract was recovered by centrifugation (12,000 rpm, 5
minutes, room temperature). This operation was repeated
twice to recover the whole extract. The recovered extract
was concentrated by centrifugation in vacuo to halve the
liquid volume. To the concentrate, 20 µl of 0.1%
trifluoroacetic acid was added, followed by thorough
stirring, and the mixture was subjected to desalting using
ZipTip (manufactured by Millipore). The protein adsorbed
on the carriers of ZipTip was eluted with 5 µl of α-cyano-
4-hydroxycinnamic acid for use as a sample solution for
analysis.
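
The peptide fragments expected from the lysylendopeptidase
digestion can be predicted in silico. The sketch below assumes
complete cleavage on the C-terminal side of every lysine residue
and no missed cleavages; the protein sequence is a hypothetical
example and is not one of the sequences of the present invention.

```python
def lysc_digest(protein):
    """Split a protein sequence after every lysine (K), keeping K on the
    C-terminal end of each fragment; complete digestion is assumed."""
    peptides = []
    current = []
    for residue in protein:
        current.append(residue)
        if residue == "K":
            peptides.append("".join(current))
            current = []
    if current:
        # C-terminal fragment that does not end in lysine.
        peptides.append("".join(current))
    return peptides


# Hypothetical sequence, for illustration only.
print(lysc_digest("MKTAYIAKQRQISFVKSH"))
```

For the hypothetical sequence this yields the fragments MK,
TAYIAK, QRQISFVK and SH.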

(5) Mass spectrometry and amino acid sequence analysis of
protein spot with matrix assisted laser desorption
ionization time of flight mass spectrometer (MALDI-TOFMS)

The sample solution for analysis was mixed in the
equivalent amount with a solution of a peptide mixture for
mass calibration (300 nmol/l Angiotensin II, 300 nmol/l
Neurotensin, 150 nmol/l ACTH clip 18-39, 2.3 nmol/l bovine
insulin B chain), and 1 µl of the obtained solution was
spotted on a stainless probe and crystallized by
spontaneous drying.

As measurement instruments, REFLEX MALDI-TOF mass 
spectrometer (manufactured by Bruker) and an N2 laser (337 
nm) were used in combination. 

The analysis by PMF (peptide-mass fingerprinting)
was carried out using integration spectra data obtained by
measuring 30 times at an accelerating voltage of 19.0 kV
and a detector voltage of 1.50 kV under reflector mode
conditions. Mass calibration was carried out by the
internal standard method.
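
Internal-standard calibration relies on the theoretical mass of
each calibrant peptide. As a hedged illustration, the
monoisotopic mass of the singly protonated ion of Angiotensin II
can be computed as below, assuming the human sequence DRVYIHPF
(the text names the peptide but does not give its sequence). The
function names are introduced here for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added for the free termini
PROTON = 1.00728   # mass of the ionizing proton


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(MONO[residue] for residue in peptide) + WATER + PROTON


def ppm_error(observed, theoretical):
    """Deviation of an observed m/z from the theoretical value, in ppm."""
    return (observed - theoretical) / theoretical * 1e6


# Assumed sequence of the Angiotensin II calibrant (human).
ANGIOTENSIN_II = "DRVYIHPF"
print(round(mh_plus(ANGIOTENSIN_II), 4))
```

The deviation of the observed calibrant peaks from such
theoretical values, for example expressed with ppm_error, is what
an internal calibration would correct for.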

The PSD (post-source decay) analysis was carried
out using integration spectra obtained by successively
altering the reflection voltage and the detector voltage at
an accelerating voltage of 27.5 kV.

The masses and amino acid sequences of the peptide
fragments derived from the protein spot after digestion
were thus determined.

(6) Identification of protein spot

From the amino acid sequence information of the
digested peptide fragments derived from the protein spot
obtained in the above (5), ORFs corresponding to the
protein were searched on the genome sequence database of
Corynebacterium glutamicum ATCC 13032 as constructed in
Example 1 to identify the protein.

The identification of the protein was carried out
using MS-Fit program and MS-Tag program of intranet Protein
Prospector.

(a) Search and identification of gene encoding high- 
expression protein 

Among the proteins derived from Corynebacterium
glutamicum ATCC 13032 showing high expression amounts in
CBB-staining shown in Fig. 2A, the proteins corresponding
to Spots-1, 2, 3, 4 and 5 were identified by the above
method.

As a result, it was found that Spot-1 corresponded
to enolase which was a protein having the amino acid
sequence of SEQ ID NO:4585; Spot-2 corresponded to
phosphoglycerate kinase which was a protein having the
amino acid sequence of SEQ ID NO:5254; Spot-3 corresponded
to glyceraldehyde-3-phosphate dehydrogenase which was a
protein having the amino acid sequence represented by SEQ
ID NO:5255; Spot-4 corresponded to fructose bisphosphate
aldolase which was a protein having the amino acid sequence
represented by SEQ ID NO:6543; and Spot-5 corresponded to
triose phosphate isomerase which was a protein having the
amino acid sequence represented by SEQ ID NO:5252.

These genes, represented by SEQ ID NOS:1085, 1754,
1775, 3043 and 1752, which encode the known proteins
corresponding to Spots-1, 2, 3, 4 and 5, respectively, are
important in the central metabolic pathway for maintaining
the life of the microorganism. Particularly, it is
suggested that the genes of Spots-2, 3 and 5 form an
operon and a high-expression promoter is encoded in the
upstream thereof (J. of Bacteriol., 174: 6067-6086 (1992)).

Also, the protein corresponding to Spot-9 in Fig. 2
was identified in the same manner as described above, and
it was found that Spot-9 was an elongation factor Tu which
was a protein having the amino acid sequence represented by
SEQ ID NO:6937, and that the protein was encoded by DNA
having the nucleotide sequence represented by SEQ ID
NO:3437.



Based on these results, the proteins having high
expression level were identified by proteome analysis using
the genome sequence database of Corynebacterium glutamicum
constructed in Example 1. Thus, the nucleotide sequences
of the genes encoding the proteins and the nucleotide
sequences upstream thereof could be searched simultaneously.
Accordingly, it is shown that nucleotide sequences having a
function as a high-expression promoter can be efficiently
selected.

(b) Search and identification of modified protein 

Among the proteins derived from Corynebacterium
glutamicum FERM BP-7134 shown in Fig. 2B, Spots-6, 7 and 8
were identified by the above method. As a result, these
three spots all corresponded to catalase which was a
protein having the amino acid sequence represented by SEQ
ID NO:3785.

Accordingly, Spots-6, 7 and 8, detected as spots
differing in isoelectric mobility, were all products
derived from a catalase gene having the nucleotide sequence
represented by SEQ ID NO:285. Accordingly, it is shown
that the catalase derived from Corynebacterium glutamicum
FERM BP-7134 was modified after the translation.
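
The reasoning above, in which several spots mapping to one
protein point to post-translational modification, can be
sketched as a grouping step. The spot assignments below are a
subset of those reported in this Example; the function name and
the grouping approach are illustrative assumptions, not the
procedure actually used.

```python
from collections import defaultdict

# Spot number -> amino acid SEQ ID NO of the identified protein
# (subset of the assignments reported in this Example).
SPOT_TO_PROTEIN = {
    1: 4585,  # enolase
    2: 5254,  # phosphoglycerate kinase
    5: 5252,  # triose phosphate isomerase
    6: 3785,  # catalase
    7: 3785,  # catalase
    8: 3785,  # catalase
}


def multi_spot_proteins(spot_to_protein):
    """Return {SEQ ID NO: [spots]} for proteins found in more than one spot."""
    groups = defaultdict(list)
    for spot, seq_id in sorted(spot_to_protein.items()):
        groups[seq_id].append(spot)
    return {seq_id: spots for seq_id, spots in groups.items()
            if len(spots) > 1}


print(multi_spot_proteins(SPOT_TO_PROTEIN))
```

Only the catalase of SEQ ID NO:3785 is returned, with Spots-6, 7
and 8, making it the candidate for a modified protein.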

Based on these results, it is confirmed that
various modified proteins can be efficiently searched by
proteome analysis using the genome sequence database of
Corynebacterium glutamicum constructed in Example 1.

(c) Search and identification of expressed protein 
effective in lysine production 

It was found that in Fig. 2A (ATCC 13032: wild
type strain), Fig. 2B (FERM BP-7134: lysine-producing
strain) and Fig. 2C (FERM BP-158: lysine-highly producing
strain), the catalase corresponding to Spot-8 and the
elongation factor Tu corresponding to Spot-9 as identified
above showed a higher expression level with an increase
in the lysine productivity.

Based on these results, it was found that promising
mutated proteins can be efficiently searched and identified
in breeding aiming at strengthening the productivity of a
target product by the proteome analysis using the genome
sequence database of Corynebacterium glutamicum constructed
in Example 1.

Moreover, useful mutation points of useful mutants
can be easily specified by searching the nucleotide
sequences (nucleotide sequences of promoter, ORF, or the
like) relating to the identified proteins using the above
database and using primers designed on the basis of the
sequences. Once the mutation points are specified,
industrially useful mutants which have the useful mutations
or other useful mutations derived therefrom can be easily
bred.

While the invention has been described in detail
and with reference to specific embodiments thereof, it will
be apparent to one of skill in the art that various changes
and modifications can be made therein without departing
from the spirit and scope thereof. All references cited
herein are incorporated in their entirety.